Similar literature
20 similar documents found
1.
We study the problem of mapping the N nodes of a data structure onto M memory modules so that they can be accessed in parallel by templates, i.e., distinct sets of nodes. In the literature, several algorithms are available for arrays (accessed by rows, columns, diagonals, and subarrays) and trees (accessed by subtrees, root-to-leaf paths, levels, etc.). Although some mapping algorithms for arrays allow conflict-free access to several templates at once (for example, rows and columns), no mapping algorithm is known for efficiently accessing subtree, path, and level templates in complete binary trees. We first prove that any mapping algorithm that is conflict-free for the tree/level template has Ω(M/log M) conflicts when access follows the path template, and vice versa. Therefore, no mapping algorithm can be conflict-free on both path and tree (or path and level) templates. Our main result is an algorithm for mapping complete binary trees with N = 2M − 1 nodes onto M memory modules in such a way that:
  • the number of conflicts for accessing an-node subtree, adjacent nodes in the same level, or consecutive nodes of a root-to-leaf path is (),
  • the load (i.e., the ratio between the maximum and minimum number of data items mapped on each module) is 1 + o(1),
  • the time complexity for retrieving the module where a given data item is stored is (1) if a preprocessing phase of space and time complexity (log) is executed, or (log log) if no preprocessing is allowed.
The algorithm can be easily generalized to complete binary trees of any size.
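To make the notion of a conflict concrete, the following minimal sketch counts conflicts for a naive mapping. It is not the algorithm from the abstract: the heap-style node numbering (root = 1), the `v % M` mapping, the conflict measure (largest number of template nodes on one module, minus one), and all function names are assumptions chosen for illustration.

```python
from collections import Counter

def conflicts(template, module_of):
    """Assumed measure: (most template nodes sharing one module) - 1."""
    counts = Counter(module_of(v) for v in template)
    return max(counts.values()) - 1

def root_to_leaf_path(leaf):
    """Heap-indexed nodes (root = 1) on the path from the root down to `leaf`."""
    path = []
    while leaf >= 1:
        path.append(leaf)
        leaf //= 2
    return path[::-1]

def level_run(first, length):
    """`length` adjacent nodes of one level, starting at heap index `first`."""
    return list(range(first, first + length))

M = 4  # number of memory modules (illustrative value)

def naive_module(v):
    """Hypothetical naive mapping: node index modulo M."""
    return v % M

print(conflicts(level_run(8, M), naive_module))        # 0
print(conflicts(root_to_leaf_path(15), naive_module))  # 2
```

Under these assumptions the naive mapping is conflict-free on M adjacent nodes of a level, but the path 1, 3, 7, 15 places three nodes on module 3, which illustrates why serving level and path templates with one mapping is hard.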

2.
Techniques are developed for mapping structured data to an ensemble of parallel memory modules in a way that limits the number of conflicts, i.e., simultaneous accesses by distinct processors to the same memory module. The techniques determine, for any given conflict tolerance c, the smallest ensemble that allows one to store any n-node data structure "of type X" in such a way that no more than c nodes of a structure are stored on the same module. This goal is achieved by determining the smallest c-perfect universal graphs for data structures "of type X." Such a graph is the smallest graph that contains a homomorphic image of each n-node structure "of type X" with each node of the image holding at most c nodes of the structure. In the current paper, "type X" refers to rooted binary trees and three array-like structures: chaotic arrays, ragged arrays, and rectangular arrays. For each of these families of data structures, the number of memory modules needed to achieve conflict tolerance c is determined to within constant factors.

3.
In this paper, we give a solution of the Firing Squad Synchronization Problem for graphs. The synchronization times of previously obtained solutions are proportional to the number of nodes of a graph. The synchronization time of our solution is proportional to the radius r_G of a graph G (3r_G + 1 or 3r_G time units, where r_G is the longest distance between the general and any other node of G). This synchronization time is minimum for an infinite number of graphs.

4.
Main memory access latency can significantly slow down the overall performance of a computer system because the average cycle time of main memory is typically 5–10 times higher than that of a processor. To cope with this problem, in addition to the use of caches, the main memory of a multiprocessor architecture is usually organized into multiple modules or banks. Although such an organization enhances memory bandwidth (the amount of data that the multiprocessor can retrieve in the same memory cycle), conflicts due to simultaneous attempts to access the same memory module may reduce the effective bandwidth. Therefore, efficient mapping schemes are required to distribute data in such a way that regular patterns, called templates, of various structures can be retrieved in parallel without memory conflicts. Prior work on data mappings mostly dealt with conflict-free access to templates such as rows, columns, or diagonals of (multidimensional) arrays, and only limited attention has been paid to access templates of nonnumeric structures such as trees. In this paper, we study optimal and balanced mappings for accessing path and subtree templates of trees, where a mapping is called optimal if it allows conflict-free access to templates with as few memory banks as possible. An optimal mapping is also called balanced if it distributes the nodes of the entire tree as evenly as possible among the available memory banks. In particular, based on Latin squares, we propose an optimal and balanced mapping for leaf-to-root paths of q-ary trees. Another (recursive) mapping for leaf-to-root paths of binary trees raises interesting combinatorial problems. We also derive an optimal and balanced mapping to access complete t-ary subtrees of complete q-ary trees, where 2 ⩽ t ⩽ q, and an optimal mapping for subtrees of binomial trees.

5.
In this paper we consider the problem of on-line graph coloring. In an instance of on-line graph coloring, the nodes are presented one at a time. As each node is presented, its edges to previously presented nodes are also given. Each node must be assigned a color, different from the colors of its neighbors, before the next node is given. Let A(G) be the number of colors used by algorithm A on a graph G and let χ(G) be the chromatic number of G. The performance ratio of an on-line graph coloring algorithm for a class of graphs C is max_{G ∈ C} A(G)/χ(G). We consider the class of d-inductive graphs. A graph G is d-inductive if the nodes of G can be numbered so that each node has at most d edges to higher-numbered nodes. In particular, planar graphs are 5-inductive, and chordal graphs are χ(G)-inductive. First Fit is the algorithm that assigns each node the lowest-numbered color possible. We show that if G is d-inductive, then First Fit uses O(d log n) colors on G. This yields an upper bound of O(log n) on the performance ratio of First Fit on chordal and planar graphs. First Fit does as well as any on-line algorithm for d-inductive graphs: we show that, for any d and any on-line graph coloring algorithm A, there is a d-inductive graph that forces A to use Ω(d log n) colors to color G. We also examine on-line graph coloring with lookahead. An algorithm is on-line with lookahead l if it must color node i after examining only the first l + i nodes. We show that, for l/log n, the lower bound of d log n colors still holds. This research was supported by an IBM Graduate Fellowship.
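The First Fit rule described above can be sketched in a few lines. The input format (a list of arrivals, each carrying the node's edges to earlier nodes) and the function name are assumptions made for this illustration; colors are numbered from 0.

```python
def first_fit_coloring(arrivals):
    """Color nodes on-line with First Fit.

    `arrivals` is a list of (node, earlier_neighbours) pairs in presentation
    order; each node receives the lowest-numbered color not already used by
    one of its previously presented neighbours.
    """
    color = {}
    for node, earlier_neighbours in arrivals:
        used = {color[u] for u in earlier_neighbours}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

# The path a-b-c-d, presented in the order a, d, b, c.
arrivals = [("a", []), ("d", []), ("b", ["a"]), ("c", ["b", "d"])]
print(first_fit_coloring(arrivals))  # {'a': 0, 'd': 0, 'b': 1, 'c': 2}
```

In this example First Fit uses three colors on a path whose chromatic number is 2, a small instance of the gap that the performance ratio measures.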

6.
In this paper, we study efficient strategies for mapping onto parallel memory systems complete trees that are accessed by fixed templates (such as complete subtrees, paths, or any combination thereof). These mappings are evaluated with respect to the following criteria: (1) the largest number of data items that can be accessed in parallel without memory conflicts; (2) the number of memory conflicts that can occur when accessing templates of size equal to the number of available memory modules, thereby exploiting the full parallelism of the system; (3) the complexity of the memory addressing scheme, i.e., the cost of retrieving the module where a given data item is mapped. We show that there are trade-offs between these three criteria and that the performance of different mapping strategies depends on the emphasis given to each criterion. More specifically, we describe an algorithm for mapping complete binary trees of height H onto M memory modules and prove that it achieves the following performance results: (1) conflict-free access to complete subtrees of size K and paths of size N such that N + K - [log K] ⩽ M; (2) at most 1 conflict in accessing complete subtrees and paths of size M; (3) O(K/M + c) conflicts when accessing a composite template of K nodes consisting of c disjoint subsets, each subset being a complete subtree, a path, or a set of consecutive nodes in a level of the tree.

7.
The main results of this paper establish relationships between the bandwidth of a graph G (the minimum, over all layouts of G in a line, of the maximum distance between images of adjacent vertices of G) and the ease of playing various pebble games on G. Three pebble games on graphs are considered: the well-known computational pebble game; the "progressive" (i.e., no recomputation allowed) version of the computational pebble game, both of which are played on directed acyclic graphs; and the quite different "breadth-first" pebble game, which is played on undirected graphs. We consider two costs of a play of a pebble game: the minimum number of pebbles needed to play the game on the graph G, and the maximum lifetime of any pebble in the game, i.e., the maximum number of moves that any pebble spends on the graph. The first set of results proves that the minimum lifetime cost of a play of either of the second two pebble games on a graph G is precisely the bandwidth of G. The second set of results establishes bounds on the pebble demand of all three pebble games in terms of the bandwidth of the graph being pebbled; for instance, the number of pebbles needed to pebble a graph G of bandwidth k is at most min(2k² + k + 1, 2k log2 |G|), and, in addition, there are bandwidth-k graphs that require 3k − 1 pebbles. The third set of results relates the difficulty of deciding the cost of playing a pebble game on a given input graph G to the bandwidth of G; for instance, the Pebble Demand problem for n-vertex graphs of bandwidth f(n) is in the class NSPACE(f(n) log2 n), and the Optimal Lifetime Problem for either of the second two pebble games is NP-complete.
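The definition of bandwidth given above translates directly into a brute-force computation. The sketch below is only a literal reading of that definition for very small graphs (it tries every layout, so it is exponential in the number of vertices); the function names are assumptions.

```python
from itertools import permutations

def layout_cost(edges, order):
    """Maximum distance between the positions of adjacent vertices in a layout."""
    pos = {v: i for i, v in enumerate(order)}
    return max(abs(pos[u] - pos[v]) for u, v in edges)

def bandwidth(vertices, edges):
    """Minimum layout cost over all linear layouts (brute force)."""
    return min(layout_cost(edges, order) for order in permutations(vertices))

path = [(0, 1), (1, 2), (2, 3)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(bandwidth(range(4), path))   # 1
print(bandwidth(range(4), cycle))  # 2
```

A path can be laid out so that every edge joins consecutive positions, while closing it into a 4-cycle forces some edge to span two positions.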

8.
A spanning tree T of a graph G = (V, E) is called a locally connected spanning tree if the set of all neighbors of v in T induces a connected subgraph of G for all v ∈ V. The problem of recognizing whether a graph admits a locally connected spanning tree is known to be NP-complete even when the input graphs are restricted to chordal graphs. In this paper, we propose linear-time algorithms for finding locally connected spanning trees in cographs, complements of bipartite graphs, and doubly chordal graphs, respectively.
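The definition can be checked mechanically for a given tree. The following sketch only verifies the local-connectivity condition for a supplied T (it assumes T is already a spanning tree of G and does not find one, so it is unrelated to the linear-time algorithms mentioned above); the edge-list input format and function names are assumptions.

```python
def adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def induces_connected(nodes, graph_adj):
    """True if `nodes` induce a connected subgraph of the graph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in graph_adj[u] & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def is_locally_connected(graph_edges, tree_edges):
    """Check that, for every v, the T-neighbors of v induce a connected subgraph of G."""
    g, t = adjacency(graph_edges), adjacency(tree_edges)
    return all(induces_connected(t[v], g) for v in t)

triangle = [("a", "b"), ("b", "c"), ("a", "c")]
tree = [("a", "b"), ("b", "c")]
print(is_locally_connected(triangle, tree))  # True
print(is_locally_connected(tree, tree))      # False
```

In the triangle, the tree neighbors a and c of b are adjacent in G; when G is just the path a-b-c they are not, so the condition fails at b.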

9.
We present a new critical section protocol designed for distributed systems with general topologies, where the physical layer is implemented as point-to-point physical links in contrast to shared-access physical media. The protocol operates correctly for any topology; however, its time performance is topology dependent. The distributed system can be modeled by a graph G(V, E), where V denotes the set of processors and E is the set of bidirectional communication links. We use n to denote |V|; D(G) is the diameter of G, T(G) is the spanning tree of G, and D(T) is the diameter of T(G). An important measure of the performance of the protocol is the amount of traffic caused by its operation. Let a message-hop be the amount of traffic generated by a single message between two adjacent nodes. The proposed protocol generates network traffic of only 3(n − 1) ∈ Θ(n) message-hops per critical section access for any topology, which is less than other existing fully distributed protocols. A lower bound on traffic for a single critical section access for a fully distributed protocol is shown to be 2(n − 1) message-hops. Some previously published algorithms generate Θ(n²) message-hops of network traffic for some topologies. Another important measure of the performance of the protocol is the cs-access time: the time required to access the critical section in the absence of other requests, which depends on the topology. The high cs-access time performance is achieved by taking a novel approach of distributing the communication and parts of the computation functions of the protocol and exploiting the physical topology. For a constant-size message, the time to traverse an edge, including the message communication software processing in the source and destination nodes, is called the message-hop-time and is denoted by t_h.
For a general graph G (with spanning tree T), the new protocol has cs-access time performance Θ(max(D(T), max(deg(v_i)))) t_h, where deg(v_i) is computed in T. For graphs where G has D(G) ∈ Θ(log2 n) and max(deg(v_i)) in G is O(log2 n), the cs-access time performance is Θ(log2 n) t_h. For the class of graphs where G has D(G) ∈ Θ(n), the cs-access time performance is Θ(n) t_h. For star graphs, the cs-access time performance is Θ(n) t_h. The worst-case time performance occurs for linear and star graphs. The proposed protocol has better network traffic performance and (depending on the topology) better or equal cs-access time performance than previously published fully distributed protocols. The protocol keeps the clock bounded in well-designed systems using a distributed predictive "clock squashing" mechanism.

10.
Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are commonplace. How can such large graphs be stored efficiently? What are the core operations and queries on those graphs? How can graph queries be answered quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using MapReduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space and accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.

11.
In this paper we study collective additive tree spanners for special families of graphs including planar graphs, graphs with bounded genus, graphs with bounded tree-width, graphs with bounded clique-width, and graphs with bounded chordality. We say that a graph $G=(V,E)$ admits a system of $\mu$ collective additive tree $r$-spanners if there is a system $\mathcal{T}(G)$ of at most $\mu$ spanning trees of $G$ such that for any two vertices $x, y$ of $G$ a spanning tree $T \in \mathcal{T}(G)$ exists such that $d_T(x,y) \le d_G(x,y) + r$. We describe a general method for constructing a "small" system of collective additive tree $r$-spanners with small values of $r$ for "well" decomposable graphs, and as a byproduct show (among other results) that any weighted planar graph admits a system of $O(\sqrt{n})$ collective additive tree 0-spanners, any weighted graph with tree-width at most $k-1$ admits a system of $k\log_2 n$ collective additive tree 0-spanners, any weighted graph with clique-width at most $k$ admits a system of $k\log_{3/2} n$ collective additive tree $(2\mathsf{w})$-spanners, and any weighted graph with size of largest induced cycle at most $c$ admits a system of $\log_2 n$ collective additive tree $(2\lfloor c/2\rfloor\mathsf{w})$-spanners and a system of $4\log_2 n$ collective additive tree $(2(\lfloor c/3\rfloor+1)\mathsf{w})$-spanners (here, $\mathsf{w}$ is the maximum edge weight in $G$). The latter result is refined for weighted weakly chordal graphs: any such graph admits a system of $4\log_2 n$ collective additive tree $(2\mathsf{w})$-spanners.
Furthermore, based on this collection of trees, we derive a compact and efficient routing scheme for those families of graphs.

12.
International Journal of Computer Mathematics, 2012, 89(1-4): 255-268
Parallel Breadth-First Search (BFS) algorithms for ordered trees and graphs on a shared memory model of a Single Instruction-stream Multiple Data-stream computer are proposed. The parallel BFS algorithm for trees computes the BFS rank of each node of an ordered tree consisting of n nodes in O(β log n) time when O(n^(1+1/β)) processors are used, β being an integer greater than or equal to 2. The parallel BFS algorithm for graphs produces Breadth-First Spanning Trees (BFSTs) of a directed graph G having n nodes in O(log d · log n) time using O(n³) processors, where d is the diameter of G. If G is a strongly connected graph or a connected undirected graph, the BFS algorithm produces n BFSTs, each BFST having a different start node.

13.
The Min-Min problem of finding a disjoint-path pair with the length of the shorter path minimized is known to be NP-hard and admits no K-approximation for any K>1 in the general case (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006). In this paper, we first show that Bhatia et al.'s NP-hardness proof (Bhatia et al. in J. Comb. Optim. 12:83–96, 2006), a claimed correction to Xu et al.'s proof (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006), for the edge-disjoint Min-Min problem in general undirected graphs is incorrect, by giving a counterexample that is an unsatisfiable 3SAT instance but is classified as a satisfiable 3SAT instance in the proof of Bhatia et al. (J. Comb. Optim. 12:83–96, 2006). We then give a correct proof of NP-hardness of this problem in undirected graphs. Finally, we give a polynomial-time algorithm for the vertex-disjoint Min-Min problem in planar graphs by showing that the vertex-disjoint Min-Min problem is polynomially solvable in an st-planar graph G=(V,E) whose corresponding auxiliary graph G(V,E∪{e(st)}) can be embedded into a plane, and that a planar graph can be decomposed into several st-planar graphs whose Min-Min paths collectively contain a Min-Min disjoint-path pair between s and t in the original graph G. To the best of our knowledge, these are the first polynomial algorithms for the Min-Min problems in planar graphs.

14.
The degree of a graph H is the maximum among the degrees of its nodes. A set of graphs L is of bounded degree if there exists a positive integer n such that the degree of each graph in L does not exceed n. We demonstrate that it is decidable whether or not the (graph) language of an arbitrary node label controlled (NLC) grammar is of bounded degree. Moreover, it is shown that, given an arbitrary NLC grammar G generating the language L(G) of bounded degree, one can effectively compute the maximum integer which appears as the degree of a graph in L(G).

15.
A graph G∗ is 1-edge fault-tolerant with respect to a graph G, denoted by 1-EFT(G), if every graph obtained by removing any edge from G∗ contains G. A 1-EFT(G) graph is optimal if it contains the minimum number of edges among all 1-EFT(G) graphs. The kth ladder graph, L_k, is defined to be the Cartesian product of P_k and P_2, where P_n is the n-vertex path graph. In this paper, we present several 1-edge fault-tolerant graphs with respect to ladders. Some of these graphs are proven to be optimal.

16.
In this paper, we investigate three strategies of how to use a spanning tree T of a graph G to navigate in G, i.e., to move from a current vertex x towards a destination vertex y via a path that is close to optimal. In each strategy, each vertex v has full knowledge of its neighborhood N_G[v] in G (or its k-neighborhood D_k(v, G), where k is a small integer) and uses a small piece of global information from the spanning tree T (e.g., distance or ancestry information in T), available locally at v, to navigate in G. We investigate advantages and limitations of these strategies on particular families of graphs, such as graphs with locally connected spanning trees, graphs with bounded length of largest induced cycle, graphs with bounded tree-length, and graphs with bounded hyperbolicity. For most of these families of graphs, the ancestry information from a Breadth-First-Search tree guarantees short enough routing paths. In many cases, the obtained results are optimal up to a constant factor.

17.
A graph G is the k-leaf power of a tree T if its vertices are leaves of T such that two vertices are adjacent in G if and only if their distance in T is at most k. Then T is the k-leaf root of G. This notion was introduced and studied by Nishimura, Ragde, and Thilikos, motivated by the search for underlying phylogenetic trees. Their results imply an O(n³) time recognition algorithm for 3-leaf powers. Later, Dom, Guo, Hüffner, and Niedermeier characterized 3-leaf powers as the (bull, dart, gem)-free chordal graphs. We show that a connected graph is a 3-leaf power if and only if it results from substituting cliques into the vertices of a tree. This characterization is much simpler than the previous characterizations via critical cliques and forbidden induced subgraphs and also leads to linear-time recognition of these graphs.
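The definition of a k-leaf power can be illustrated by constructing G from a given tree. This sketch covers only the forward direction (tree to leaf power), not the recognition problem discussed above; it assumes that every leaf of T becomes a vertex of G, and the adjacency-map input and function name are chosen for illustration.

```python
from collections import deque
from itertools import combinations

def leaf_power(tree_adj, k):
    """Edges of the k-leaf power of a tree given as an adjacency map."""
    leaves = sorted(v for v, nbrs in tree_adj.items() if len(nbrs) == 1)

    def distances_from(src):
        # Plain BFS; in a tree this yields the unique path lengths.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in tree_adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return dist

    dist = {leaf: distances_from(leaf) for leaf in leaves}
    return {(a, b) for a, b in combinations(leaves, 2) if dist[a][b] <= k}

# Internal nodes x and y; leaves a and b hang off x, leaf c hangs off y.
tree = {"x": {"a", "b", "y"}, "y": {"x", "c"},
        "a": {"x"}, "b": {"x"}, "c": {"y"}}
print(leaf_power(tree, 2))  # {('a', 'b')}
```

With k = 3 the same tree yields all three pairs of leaves, so its 3-leaf power is a triangle.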

18.
Let G be an undirected graph and $\mathcal{T}=\{T_{1},\ldots,T_{k}\}$ be a collection of disjoint subsets of nodes. Nodes in $T_1 \cup \cdots \cup T_k$ are called terminals; other nodes are called inner. By a $\mathcal{T}$-path we mean a path P such that P connects terminals from distinct sets in $\mathcal{T}$ and all internal nodes of P are inner. We study the problem of finding a maximum cardinality collection ℘ of $\mathcal{T}$-paths such that at most two paths in ℘ pass through any node. Our algorithm is purely combinatorial and has time complexity $O(mn^2)$, where n and m denote the numbers of nodes and edges in G, respectively.

19.
A "book-embedding" of a graph G comprises embedding the graph's nodes along the spine of a book and embedding the edges on the pages so that the edges embedded on the same page do not intersect. This is also referred to as the page model. The "pagenumber" of a graph is the thickness of the smallest (in number of pages) book into which G can be embedded. The problem has been studied only for some specific kinds of graphs. The pagenumber problem is known to be NP-complete, even if the order of nodes on the spine is fixed. Using genetic algorithms, we describe the first algorithm for solving the pagenumber problem that can be applied to arbitrary graphs. Experimental results for several kinds of graphs are obtained. We were particularly interested in graphs that correspond to some well-known interconnection networks (such as hypercubes and meshes). We also introduced and experimented with a 2-D pagenumber model for embedding graphs.
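The page model can be made concrete with a small validity check. This sketch is not the genetic algorithm from the abstract; it only tests a given spine order and page assignment, using the standard fact that two edges drawn on the same page cross exactly when their endpoints interleave along the spine. The function names and input format are assumptions.

```python
from itertools import combinations

def edges_cross(e, f, pos):
    """True if the endpoints of edges e and f interleave along the spine."""
    a, b = sorted((pos[e[0]], pos[e[1]]))
    c, d = sorted((pos[f[0]], pos[f[1]]))
    return a < c < b < d or c < a < d < b

def is_valid_book_embedding(order, page_of):
    """Check that no two edges assigned to the same page cross.

    `order` lists the nodes along the spine; `page_of` maps each edge
    (a pair of nodes) to its page number.
    """
    pos = {v: i for i, v in enumerate(order)}
    return not any(
        page_of[e] == page_of[f] and edges_cross(e, f, pos)
        for e, f in combinations(page_of, 2)
    )

# K4 on the spine order 0, 1, 2, 3: only edges (0, 2) and (1, 3) interleave.
one_page = {(0, 1): 0, (1, 2): 0, (2, 3): 0, (0, 3): 0, (0, 2): 0, (1, 3): 0}
two_pages = dict(one_page)
two_pages[(1, 3)] = 1
print(is_valid_book_embedding([0, 1, 2, 3], one_page))   # False
print(is_valid_book_embedding([0, 1, 2, 3], two_pages))  # True
```

A search procedure for the pagenumber, such as the genetic algorithm mentioned above, could use a check of this kind as part of its fitness evaluation, though the abstract does not say how its fitness function is defined.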

20.
Although nonuniform memory access architecture provides better scalability for multicore systems, cores accessing memory on remote nodes take longer than those accessing memory on local nodes. Remote memory access accompanied by contention for the internode interconnection degrades performance. Properly mapping threads to cores and the data they access to their nodes can substantially improve performance and energy efficiency. However, an operating system kernel's load-balancing activity may migrate threads across nodes, which disrupts the thread mapping. In addition, subsequent data mapping pays the cost of page migration to reduce remote memory access. Migrating unsuitable threads is detrimental to system performance. This paper focuses on improving the kernel's internode load balancing on nonuniform memory access systems. We develop a memory-aware kernel mechanism and policies to reduce remote memory access incurred by internode thread migration. The Linux kernel's load-balancing mechanism is modified to incorporate selection policies in internode thread migration, and the kernel is modified to track the amount of memory used by each thread on each node. With this information, well-designed policies can then choose suitable threads for internode migration. The purpose is to avoid migrating a thread that might incur relatively more remote memory access and page migration. The experimental results show that with our mechanism and the proposed selection policies, system performance is substantially increased compared with the unmodified Linux kernel, which does not consider memory usage and always migrates the first-fit thread in the runqueue that can be migrated to the target central processing unit.
