Given an unlabeled, unweighted, and undirected graph with n vertices and small (but not necessarily constant) treewidth k, we consider the problem of preprocessing the graph to build space-efficient encodings (oracles) to perform various queries efficiently. We assume the word RAM model where the size of a word is Ω(log n) bits. The first oracle we present is the navigation oracle, which facilitates the primitive navigation operations of adjacency, neighborhood, and degree queries. By way of an enumeration argument, which is of interest in its own right, we show the space requirement of the oracle is optimal to within lower order terms for all graphs with n vertices and treewidth k. The oracle supports the mentioned queries all in constant worst-case time. The second oracle we present is an exact distance oracle, which facilitates distance queries between any pair of vertices (i.e., an all-pairs shortest-path oracle). The space requirement of the oracle is also optimal to within lower order terms. Moreover, the distance queries perform in O(k^3 log^3 k) time. In particular, for the class of graphs of popular interest, graphs of bounded treewidth (where k is constant), the distances are reported in constant worst-case time.

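In scenarios involving complex graph data, distance queries and path queries on graphs have important applications. Some applications involve very large graphs and require fast query responses, which calls for efficient query strategies. Our study finds that the nodes within a graph usually differ in importance, and that the number of times a node is traversed (its traversal count) can be used to measure that importance. A basic query strategy is to build labels for nodes according to their traversal counts and to guarantee that distance queries and path queries can be answered from the node labels alone, thereby avoiding any traversal of the graph. These labels should be as small as possible to reduce space overhead and speed up queries, while their construction must be fast enough to keep index building efficient. We call this traversal-count-based query processing strategy the traversal count algorithm; experimental results confirm its effectiveness.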
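
To make the idea of answering a distance query from node labels alone more concrete, here is a minimal sketch of a generic hub-label (2-hop) query. It is not the traversal-count construction from the abstract: the labels below are written by hand for a three-node path graph, and the names `label_distance` and `labels` are hypothetical.

```python
import math

def label_distance(label_u, label_v):
    """Return the minimum of d(u, hub) + d(hub, v) over hubs shared by both labels.

    Each label maps a hub node to the distance between the labeled node and
    that hub. Returns math.inf when the two labels share no hub.
    """
    # Iterate over the smaller label and probe the larger one.
    if len(label_u) <= len(label_v):
        small, large = label_u, label_v
    else:
        small, large = label_v, label_u
    best = math.inf
    for hub, dist_to_hub in small.items():
        if hub in large:
            best = min(best, dist_to_hub + large[hub])
    return best

# Hand-written labels (an assumption) for the path graph a - b - c,
# where every shortest path passes through hub b or an endpoint.
labels = {
    "a": {"a": 0, "b": 1},
    "b": {"b": 0},
    "c": {"c": 0, "b": 1},
}

print(label_distance(labels["a"], labels["c"]))
```

The query touches only the two labels, never the graph itself, which is why small labels translate directly into fast queries.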

We present a novel algorithm to solve the non-negative single-source shortest path problem on road networks and graphs with low highway dimension. After a quick preprocessing phase, we can compute all distances from a given source in the graph with essentially a linear sweep over all vertices. Because this sweep is independent of the source, we are able to reorder vertices in advance to exploit locality. Moreover, our algorithm takes advantage of features of modern CPU architectures, such as SSE and multiple cores. Compared to Dijkstra’s algorithm, our method needs fewer operations, has better locality, and is better able to exploit parallelism at multi-core and instruction levels. We gain additional speedup when implementing our algorithm on a GPU, where it is up to three orders of magnitude faster than Dijkstra’s algorithm on a high-end CPU. This makes applications based on all-pairs shortest-paths practical for continental-sized road networks. Several algorithms, such as computing the graph diameter, arc flags, or exact reaches, can be greatly accelerated by our method.

In this paper, we study a variant of reachability queries, called label-constraint reachability (LCR) queries. Specifically, given a label set S and two vertices u1 and u2 in a large directed graph G, we check the existence of a directed path from u1 to u2, where edge labels along the path are a subset of S. We propose the path-label transitive closure method to answer LCR queries. Specifically, we transform an edge-labeled directed graph into an augmented DAG by replacing the maximal strongly connected components with bipartite graphs. We also propose a Dijkstra-like algorithm to compute path-label transitive closure by re-defining the “distance” of a path. Compared with the existing solutions, we prove that our method is optimal in terms of the search space. Furthermore, we propose a simple yet effective partition-based framework (local path-label transitive closure + online traversal) to answer LCR queries in large graphs. We prove that finding the optimal graph partition to minimize query processing cost is an NP-hard problem. Therefore, we propose a sampling-based solution to find a sub-optimal partition. Moreover, we address the index maintenance issues to answer LCR queries over dynamic graphs. Extensive experiments confirm the superiority of our method.
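
The LCR query defined in this abstract can be illustrated with a plain online traversal. The sketch below is only a baseline breadth-first search restricted to edges whose label lies in S; it is not the path-label transitive closure index the abstract proposes, and the function name `lcr_reachable` and the toy edge list are assumptions made for illustration.

```python
from collections import deque

def lcr_reachable(edges, source, target, allowed):
    """Return True if target is reachable from source using only edges
    whose label is in the set `allowed`.

    `edges` is an iterable of (u, v, label) triples for a directed graph.
    """
    # Keep only the edges permitted by the label constraint.
    adj = {}
    for u, v, label in edges:
        if label in allowed:
            adj.setdefault(u, []).append(v)

    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

# Toy graph (an assumption): u1 -a-> x -b-> u2, plus a direct edge u1 -c-> u2.
edges = [("u1", "x", "a"), ("x", "u2", "b"), ("u1", "u2", "c")]
print(lcr_reachable(edges, "u1", "u2", {"a", "b"}))
print(lcr_reachable(edges, "u1", "u2", {"a"}))
```

Each query costs a full traversal in the worst case, which is the overhead an index such as the one described in the abstract is meant to avoid.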

Since today’s real-world graphs, such as social network graphs, are evolving all the time, it is of great importance to perform graph computations and analysis in these dynamic graphs. Due to the fact that many applications such as social network link analysis with the existence of inactive users need to handle failed links or nodes, decremental computation and maintenance for graphs is considered a challenging problem. Shortest path computation is one of the most fundamental operations for managing and analyzing large graphs. A number of indexing methods have been proposed to answer distance queries in static graphs. Unfortunately, there is little work on answering such queries for dynamic graphs. In this paper, we focus on the problem of computing the shortest path distance in dynamic graphs, particularly on decremental updates (i.e., edge deletions). We propose maintenance algorithms based on distance labeling, which can handle decremental updates efficiently. By exploiting properties of distance labeling in original graphs, we are able to efficiently maintain distance labeling for new graphs. We experimentally evaluate our algorithms using eleven real-world large graphs and confirm the effectiveness and efficiency of our approach. More specifically, our method can speed up index re-computation by up to an order of magnitude compared with the state-of-the-art method, Pruned Landmark Labeling (PLL).

An antimagic labeling of a connected graph with m edges is an injective assignment of labels from {1,…,m} to the edges such that the sums of incident labels are distinct at distinct vertices. Hartsfield and Ringel conjectured that every connected graph other than K2 has an antimagic labeling. We prove this for the classes of split graphs and graphs decomposable under the canonical decomposition introduced by Tyshkevich. As a consequence, we provide a sufficient condition on graph degree sequences to guarantee an antimagic labeling.
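
The definition of an antimagic labeling can be checked mechanically. The following sketch, with the hypothetical name `is_antimagic`, verifies that a given edge labeling uses each of 1..m exactly once and that all vertex sums differ; it only tests a labeling and does not construct one.

```python
def is_antimagic(edges, labels):
    """Check whether `labels` is an antimagic labeling of the graph `edges`.

    `edges` is a list of (u, v) pairs and `labels[i]` is the label of
    `edges[i]`. The labels must be exactly 1..m, each used once, and the
    sums of incident labels must be distinct across vertices.
    """
    m = len(edges)
    # An injective assignment from {1, ..., m} to m edges uses every label once.
    if sorted(labels) != list(range(1, m + 1)):
        return False

    vertex_sum = {}
    for (u, v), label in zip(edges, labels):
        vertex_sum[u] = vertex_sum.get(u, 0) + label
        vertex_sum[v] = vertex_sum.get(v, 0) + label

    # Antimagic means no two vertices share the same sum.
    return len(set(vertex_sum.values())) == len(vertex_sum)

path = [(0, 1), (1, 2)]   # path on three vertices
k2 = [(0, 1)]             # single edge, the excluded graph K2
print(is_antimagic(path, [1, 2]), is_antimagic(k2, [1]))
```

The K2 case shows why the conjecture excludes it: its single edge gives both endpoints the same sum.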

Embar, Varun; Srinivasan, Sriram; Getoor, Lise. Machine Learning, 2021, 110(7): 1847-1866

Statistical relational learning (SRL) and graph neural networks (GNNs) are two powerful approaches for learning and inference over graphs. Typically, they are evaluated in terms of simple metrics such as accuracy over individual node labels. Complex aggregate graph queries (AGQ) involving multiple nodes, edges, and labels are common in the graph mining community and are used to estimate important network properties such as social cohesion and influence. While graph mining algorithms support AGQs, they typically do not take into account uncertainty, or when they do, make simplifying assumptions and do not build full probabilistic models. In this paper, we examine the performance of SRL and GNNs on AGQs over graphs with partially observed node labels. We show that, not surprisingly, inferring the unobserved node labels as a first step and then evaluating the queries on the fully observed graph can lead to sub-optimal estimates, and that a better approach is to compute these queries as an expectation under the joint distribution. We propose a sampling framework to tractably compute the expected values of AGQs. Motivated by the analysis of subgroup cohesion in social networks, we propose a suite of AGQs that estimate the community structure in graphs. In our empirical evaluation, we show that by estimating these queries as an expectation, SRL-based approaches yield up to a 50-fold reduction in average error when compared to existing GNN-based approaches.


Finding maximal homogeneous clique sets   Total citations: 1, self-citations: 0, citations by others: 1
Many datasets can be encoded as graphs with sets of labels associated with the vertices. We consider this kind of graph and we propose to look for patterns called maximal homogeneous clique sets, where such a pattern is a subgraph that is structured in several large cliques and where all vertices share enough labels. We present an algorithm based on graph enumeration to compute all patterns satisfying user-defined constraints on the number of separated cliques, on the size of these cliques, and on the number of labels shared by all the vertices. Our approach is tested on real datasets based on a social network of scientific collaborations and on a biological network of protein–protein interactions. The experiments show that the patterns are useful to exhibit subgraphs organized in several core modules of interactions. Performance is reported on real data and also on synthetic data, showing that the approach can be applied to different kinds of large datasets.

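A useful method for deciding graph isomorphism is to apply elementary transformations to the adjacency matrix of one graph so that it becomes the adjacency matrix of the other. Unfortunately, when the two matrices are not equal after such a transformation, this does not show that the two graphs are non-isomorphic, because another sequence of transformations might make the matrices equal. On the other hand, exhausting all sequences of transformations involves n! possibilities (where n is the number of vertices); when n is large, trying every possibility to decide whether two graphs are isomorphic is infeasible, and is an NP-hard problem. This paper proposes a simple and effective method for deciding graph isomorphism. First, the adjacency matrix is used to generate a row code-distance XOR matrix and a row code-distance XNOR matrix. Second, we search for a row-to-row permutation among the adjacency matrix, the row code-distance XOR matrix, and the row code-distance XNOR matrix that keeps the row elements identical. If such a permutation exists, the graphs are isomorphic; otherwise they are not. Finally, the isomorphism function is determined from the row-to-row permutation; it gives a one-to-one correspondence between the vertices of the two graphs that preserves adjacency.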

Given a graph, we define a base set to be a set of integers of size equal to the number of vertices in the graph. Given a graph and a base set, a labeling of the graph from the base set is an assignment of distinct integers from the base set to the vertices of the graph. The gap of an edge in a labeled graph is the absolute value of the difference between the labels of its endpoints. The gap of a labeled graph is the sum of the gaps of its edges. The maximum gap graph labeling problem takes as input a graph and a base set and maximizes the gap of the graph over all possible labelings from the base set. We show that this problem is NP-complete even when the base set is restricted to consecutive integers. We also show that this restricted case has polynomial time approximations that achieve a factor of 2/3 for trees, of 1/2 for bipartite graphs, and of 1/4 for general graphs, with a deterministic algorithm, while an expected factor of 1/3 for general graphs is achieved with a randomized algorithm. The case of general base sets is approximated within an expected factor of 1/16 for general graphs with a randomized polynomial time algorithm. We finally give a polynomial time algorithm that solves the maximum gap graph labeling problem for a graph that has bounded degree and bounded treewidth. The maximum graph labeling problem shows connections with the graceful tree conjecture.
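
The definitions of edge gap and graph gap translate directly into code. The sketch below computes the gap of a labeled graph and, for very small inputs only, finds a maximum-gap labeling by trying every permutation of the base set. The exhaustive search is an illustration of the problem statement, not one of the approximation algorithms mentioned in the abstract, and the names `gap` and `max_gap_bruteforce` are hypothetical.

```python
from itertools import permutations

def gap(edges, labeling):
    """Sum of |label(u) - label(v)| over all edges (u, v)."""
    return sum(abs(labeling[u] - labeling[v]) for u, v in edges)

def max_gap_bruteforce(vertices, edges, base_set):
    """Try every assignment of distinct base-set integers to the vertices.

    Returns (best_gap, best_labeling). Runs in factorial time, so it is
    usable only on tiny graphs.
    """
    best_value, best_labeling = None, None
    for perm in permutations(base_set):
        labeling = dict(zip(vertices, perm))
        value = gap(edges, labeling)
        if best_value is None or value > best_value:
            best_value, best_labeling = value, labeling
    return best_value, best_labeling

# Path a - b - c with the consecutive base set {1, 2, 3}.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
value, labeling = max_gap_bruteforce(vertices, edges, [1, 2, 3])
print(value, labeling)
```

On this path the best labelings put an extreme value (1 or 3) on the middle vertex, since it is incident to both edges.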

We propose a novel solution schema called the Hierarchical Labeling Schema (HLS) to answer reachability queries in directed graphs. Different from many existing approaches that focus on static directed acyclic graphs (DAGs), our schema focuses on directed cyclic graphs (DCGs) where vertices or arcs could be added to a graph incrementally. Unlike many of the traditional approaches, HLS does not require the graph to be acyclic in constructing its index. Therefore, it could, in fact, be applied to both DAGs and DCGs. When vertices or arcs are added to a graph, the HLS is capable of updating the index incrementally instead of re-computing the index from scratch each time, making it more efficient than many other approaches in practice. The basic idea of HLS is to create a tree for each vertex in a graph and link the trees together so that whenever two vertices are given, we can immediately know whether there is a path between them by referring to the appropriate trees. We conducted extensive experiments on both real-world datasets and synthesized datasets. We compared the performance of HLS, in terms of index construction time, query processing time and space consumption, with two state-of-the-art methodologies, the path-tree method and the 3-hop method. We also conducted simulations to model the situation when a graph is updated incrementally. The performance comparison of different algorithms against HLS on static graphs has also been studied. Our results show that HLS is highly competitive in practice and is particularly useful in the cases where the graphs are updated frequently.

We show that the vertices of an edge-weighted undirected graph can be labeled with labels of size O(n) such that the exact distance between any two vertices can be inferred from their labels alone in time. This improves the previous best exact distance labeling scheme that also requires O(n)-sized labels but time to compute the distance. Our scheme is almost optimal as exact distance labeling is known to require labels of length Ω(n).

Distance labeling schemes are composed of a marker algorithm for labeling the vertices of a graph with short labels, coupled with a decoder algorithm allowing one to compute the distance between any two vertices directly from their labels (without using any additional information). As applications for distance labeling schemes concern mainly large and dynamically changing networks, it is of interest to study distributed dynamic labeling schemes. The current paper considers the problem on dynamic trees, and proposes efficient distributed schemes for it. The paper first presents a labeling scheme for distances in the dynamic tree model, with amortized message complexity O(log^2 n) per operation, where n is the size of the tree at the time the operation takes place. The protocol maintains O(log^2 n) bit labels. This label size is known to be optimal even in the static scenario. A more general labeling scheme is then introduced for the dynamic tree model, based on extending an existing static tree labeling scheme to the dynamic setting. The approach fits a number of natural tree functions, such as distance, separation level, and flow. The main resulting scheme incurs an overhead of an O(log n) multiplicative factor in both the label size and amortized message complexity in the case of dynamically growing trees (with no vertex deletions). If an upper bound on n is known in advance, this method yields a different tradeoff, with an O(log^2 n / log log n) multiplicative overhead on the label size but only an O(log n / log log n) overhead on the amortized message complexity. In the fully dynamic model the scheme also incurs an increased additive overhead in amortized communication, of O(log^2 n) messages per operation.

In many applications, the properties of an object being modeled are stored as labels on vertices or edges of a graph. In this paper, we consider succinct representation of labeled graphs. Our main results are the succinct representations of labeled and multi-labeled graphs (we consider planar triangulations, planar graphs and k-page graphs) to support various label queries efficiently. The additional space cost to store the labels is essentially the information-theoretic minimum. As far as we know, our representations are the first succinct representations of labeled graphs. We also have two preliminary results to achieve the main contribution. First, we design a succinct representation of unlabeled planar triangulations to support the rank/select of edges in ccw (counterclockwise) order in addition to the other operations supported in previous work. Second, we design a succinct representation for a k-page graph when k is large to support various navigational operations more efficiently. In particular, we can test the adjacency of two vertices in O(lg k) time, while previous work uses O(k) time.

Let G=(V,E) be a weighted undirected graph, with non-negative edge weights. We consider the problem of efficiently computing approximate distances between all pairs of vertices in G. While many efficient algorithms are known for this problem in unweighted graphs, not many results are known for this problem in weighted graphs. Zwick (J. Assoc. Comput. Mach. 49:289–317, 2002) showed that for any fixed ε>0, stretch 1+ε distances (a path in G between u, v ∈ V is said to be of stretch t if its length is at most t times the distance between u and v in G) between all pairs of vertices in a weighted directed graph on n vertices can be computed in $\tilde{O}(n^{\omega})$ time, where ω<2.376 is the exponent of matrix multiplication and n is the number of vertices. It is known that finding distances of stretch less than 2 between all pairs of vertices in G is at least as hard as Boolean matrix multiplication of two n×n matrices. Here we show that all-pairs stretch 2+ε distances for any fixed ε>0 in G can be computed in expected time $O(n^{9/4})$. This algorithm uses a fast rectangular matrix multiplication subroutine. We also present a combinatorial algorithm (that is, it does not use fast matrix multiplication) with expected running time $O(n^{9/4})$ for computing all-pairs stretch 5/2 distances in G. This combinatorial algorithm will serve as a key step in our all-pairs stretch 2+ε distances algorithm.

The ever-increasing size of graphs makes them difficult to query and store. In this paper, we present Shrink, a compression method that reduces the size of the graph while preserving the distances between the nodes. The compression is based on the iterative merging of the nodes. During each merging, a system of linear equations is solved to define new edge weights in a way that the new weights have the least effect on the distances. Merging nodes continues until the desired size for the compressed graph is reached. The compressed graph, also known as the coarse graph, can be queried without decompression. As the complexity of distance-based queries such as shortest path queries is highly dependent on the size of the graph, Shrink improves the performance in terms of time and storage. Shrink not only provides the length of the shortest path but also identifies the nodes on the path. The approach has been applied to both weighted and unweighted graphs including road network, friendship network, collaboration network, web graph and social network. In the experiment, a road network with more than 2.5 million nodes is reduced to a fifth of its size while the average relative error is less than 1%.

We study the problem of answering $k$-hop reachability queries in a directed graph, i.e., whether there exists a directed path of length $k$ from a source query vertex to a target query vertex in the input graph. The problem of $k$-hop reachability is a generalization of classic reachability (where $k=\infty$). Existing indexes for processing classic reachability queries, as well as for processing shortest path distance queries, are not applicable or not efficient for processing $k$-hop reachability queries. We propose an efficient index for processing $k$-hop reachability queries. Our experimental results on a wide range of real datasets show that our method is efficient and scalable in terms of both index construction and query processing.
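
As an index-free baseline for the query in this abstract, a depth-bounded breadth-first search answers k-hop reachability directly. The sketch assumes that the query asks for a path of at most k edges, which is the reading under which k = ∞ reduces to classic reachability; the function name `k_hop_reachable` and the toy graph are hypothetical, and this is not the index the abstract proposes.

```python
from collections import deque

def k_hop_reachable(adj, source, target, k):
    """Return True if target can be reached from source in at most k hops.

    `adj` maps each vertex to a list of its out-neighbors. Passing
    k = float("inf") gives classic reachability.
    """
    if source == target:
        return True
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            # Vertices at depth k may not be expanded any further.
            continue
        for v in adj.get(u, []):
            if v not in depth:
                if v == target:
                    return True
                depth[v] = depth[u] + 1
                queue.append(v)
    return False

# Toy chain (an assumption): s -> a -> b -> t, so t is exactly 3 hops from s.
adj = {"s": ["a"], "a": ["b"], "b": ["t"]}
print(k_hop_reachable(adj, "s", "t", 2), k_hop_reachable(adj, "s", "t", 3))
```

Because breadth-first search visits vertices in order of hop distance, the first time the target is seen is along a shortest path, so the depth cutoff is sufficient.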

Motivated by a problem of targeted advertising in social networks, we introduce a new model of online learning on labeled graphs where the graph is initially unknown and the algorithm is free to choose which vertex to predict next. For this learning model, we define an appropriate measure of regularity of a graph labeling called the merging degree. In general, the merging degree of a graph is small when its vertices can be partitioned into a few well-separated clusters within which labels are roughly constant. For the special case of binary labeled graphs, the merging degree is a more refined measure than the cutsize. After observing that natural nonadaptive exploration/prediction strategies, like depth-first with majority vote, do not behave satisfactorily on graphs with small merging degree, we introduce an efficiently implementable adaptive strategy whose cumulative loss is controlled by the merging degree. A matching lower bound shows that in the case of binary labels our analysis cannot be improved.

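A multiple-to-one subgraph isomorphism testing method   Total citations: 3, self-citations: 0, citations by others: 3
张硕; 李建中; 高宏; 邹兆年. 《软件学报》 (Journal of Software), 2010, 21(3): 401-414
This paper proposes a method for the problem of subgraph isomorphism testing from multiple small graphs to one large graph, where the small graphs are given in advance and the large graph is submitted online by the user. First, based on DFS codes, a compressed organization of the set of small graphs is proposed. Second, a subgraph isomorphism testing algorithm from multiple small graphs to one large graph, with a forward pruning technique, is proposed. In addition, an effective data-mining-based indexing technique is given. Analysis and experimental results confirm that the online computation cost of the proposed method is far lower than that of existing methods: online execution is about one order of magnitude faster than existing methods, and offline construction is more than one order of magnitude faster.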

We present a parallel toolkit for pairwise distance computation in massive networks. Computing the exact shortest paths between a large number of vertices is a costly operation, and serial algorithms are not practical for billion-scale graphs. We first describe an efficient parallel method to solve the single source shortest path problem on commodity hardware with no shared memory. Using it as a building block, we introduce a new parallel algorithm to estimate the shortest paths between arbitrary pairs of vertices. Our method exploits data locality, produces highly accurate results, and allows batch computation of shortest paths with 7% average error in graphs that contain billions of edges. The proposed algorithm is up to two orders of magnitude faster than previously suggested algorithms and does not require large amounts of memory or expensive high-end servers. We further leverage this method to estimate the closeness and betweenness centrality metrics, which involve systems challenges dealing with indexing, joining, and comparing large datasets efficiently. In one experiment, we mined a real-world Web graph with 700 million nodes and 12 billion edges to identify the most central vertices and calculated more than 63 billion shortest paths in 6 h on a 20-node commodity cluster. Copyright © 2014 John Wiley & Sons, Ltd.
