首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
How do the k-core structures of real-world graphs look like? What are the common patterns and the anomalies? How can we exploit them for applications? A k-core is the maximal subgraph in which all vertices have degree at least k. This concept has been applied to such diverse areas as hierarchical structure analysis, graph visualization, and graph clustering. Here, we explore pervasive patterns related to k-cores and emerging in graphs from diverse domains. Our discoveries are: (1) Mirror Pattern: coreness (i.e., maximum k such that each vertex belongs to the k-core) is strongly correlated with degree. (2) Core-Triangle Pattern: degeneracy (i.e., maximum k such that the k-core exists) obeys a 3-to-1 power-law with respect to the count of triangles. (3) Structured Core Pattern: degeneracy–cores are not cliques but have non-trivial structures such as core–periphery and communities. Our algorithmic contributions show the usefulness of these patterns. (1) Core-A, which measures the deviation from Mirror Pattern, successfully spots anomalies in real-world graphs, (2) Core-D, a single-pass streaming algorithm based on Core-Triangle Pattern, accurately estimates degeneracy up to 12 \(\times \) faster than its competitor. (3) Core-S, inspired by Structured Core Pattern, identifies influential spreaders up to 17 \(\times \) faster than its competitors with comparable accuracy.  相似文献   

2.
Given a simple undirected graph G = (V, E) and an integer k < |V|, the Sparsest k-Subgraph problem asks for a set of k vertices which induces the minimum number of edges. As a generalization of the classical independent set problem, Sparsest k-Subgraph is ????-hard and even not approximable unless ?????? in general graphs. Thus, we investigate Sparsest k-Subgraph in graph classes where independent set is polynomial-time solvable, such as subclasses of perfect graphs. Our two main results are the ????-hardness of Sparsest k-Subgraph on chordal graphs, and a greedy 2-approximation algorithm. Finally, we also show how to derive a P T A S for Sparsest k-Subgraph on proper interval graphs.  相似文献   

3.
In the framework of parameterized complexity, exploring how one parameter affects the complexity of a different parameterized (or unparameterized problem) is of general interest. A well-developed example is the investigation of how the parameter treewidth influences the complexity of (other) graph problems. The reason why such investigations are of general interest is that real-world input distributions for computational problems often inherit structure from the natural computational processes that produce the problem instances (not necessarily in obvious, or well-understood ways). The max leaf number ml(G) of a connected graph G is the maximum number of leaves in a spanning tree for G. Exploring questions analogous to the well-studied case of treewidth, we can ask: how hard is it to solve 3-Coloring, Hamilton Path, Minimum Dominating Set, Minimum Bandwidth or many other problems, for graphs of bounded max leaf number? What optimization problems are W[1]-hard under this parameterization? We do two things:
  1. (1)
    We describe much improved FPT algorithms for a large number of graph problems, for input graphs G for which ml(G)≤k, based on the polynomial-time extremal structure theory canonically associated to this parameter. We consider improved algorithms both from the point of view of kernelization bounds, and in terms of improved fixed-parameter tractable (FPT) runtimes O *(f(k)).
     
  2. (2)
    The way that we obtain these concrete algorithmic results is general and systematic. We describe the approach, and raise programmatic questions.
     
  相似文献   

4.
The Planar Feedback Vertex Set problem asks whether an n-vertex planar graph contains at most k vertices meeting all its cycles. The Face Cover problem asks whether all vertices of a plane graph G lie on the boundary of at most k faces of G. Standard techniques from parameterized algorithm design indicate that both problems can be solved by sub-exponential parameterized algorithms (where k is the parameter). In this paper we improve the algorithmic analysis of both problems by proving a series of combinatorial results relating the branchwidth of planar graphs with their face cover. Combining this fact with duality properties of branchwidth, allows us to derive analogous results on feedback vertex set. As a consequence, it follows that Planar Feedback Vertex Set and Face Cover can be solved in \(O(2^{15.11\cdot\sqrt{k}}+n^{2})\) and \(O(2^{10.1\cdot\sqrt {k}}+n^{2})\) steps, respectively.  相似文献   

5.
A degree-constrained graph orientation of an undirected graph G is an assignment of a direction to each edge in G such that the outdegree of every vertex in the resulting directed graph satisfies a specified lower and/or upper bound. Such graph orientations have been studied for a long time and various characterizations of their existence are known. In this paper, we consider four related optimization problems introduced in reference (Asahiro et al. LNCS 7422, 332–343 (2012)): For any fixed non-negative integer W, the problems MAX W-LIGHT, MIN W-LIGHT, MAX W-HEAVY, and MIN W-HEAVY take as input an undirected graph G and ask for an orientation of G that maximizes or minimizes the number of vertices with outdegree at most W or at least W. As shown in Asahiro et al. LNCS 7422, 332–343 (2012)).  相似文献   

6.
Given a real world graph, how can we find a large subgraph whose partition quality is much better than the original? How can we use a partition of that subgraph to discover a high quality global partition? Although graph partitioning especially with balanced sizes has received attentions in various applications, it is known NP-hard, and also known that there is no good cut at a large scale for real graphs. In this paper, we propose a novel approach for graph partitioning. Our first focus is on finding a large subgraph with high quality partitions, in terms of conductance. Despite the difficulty of the task for the whole graph, we observe that there is a large connected subgraph whose partition quality is much better than the original. Our proposed method MTP finds such a subgraph by removing “hub” nodes with large degrees, and taking the remaining giant connected component. Further, we extend MTP to gb MTP (Global Balanced MTP) for discovering a global balanced partition. gb MTP attaches the excluded nodes in MTP to the partition found by MTP in a greedy way. In experiments, we demonstrate that MTP finds a subgraph of a large size with low conductance graph partitions, compared with competing methods. We also show that the competitors cannot find connected subgraphs while our method does, by construction. This improvement in partition quality for the subgraph is especially noticeable for large scale cuts—for a balanced partition, down to 14 % of the original conductance with the subgraph size 70 % of the total. As a result, the found subgraph has clear partitions at almost all scales compared with the original. Moreover, gb MTP generally discovers global balanced partitions whose conductance are lower than those found by METIS, the state-of-the-art graph partitioning method.  相似文献   

7.
The paper studies three fundamental problems in graph analytics, computing connected components (CCs), biconnected components (BCCs), and 2-edge-connected components (ECCs) of a graph. With the recent advent of big data, developing efficient distributed algorithms for computing CCs, BCCs and ECCs of a big graph has received increasing interests. As with the existing research efforts, we focus on the Pregel programming model, while the techniques may be extended to other programming models including MapReduce and Spark. The state-of-the-art techniques for computing CCs and BCCs in Pregel incur \(O(m\times \#\text {supersteps})\) total costs for both data communication and computation, where m is the number of edges in a graph and #supersteps is the number of supersteps. Since the network communication speed is usually much slower than the computation speed, communication costs are the dominant costs of the total running time in the existing techniques. In this paper, we propose a new paradigm based on graph decomposition to compute CCs and BCCs with O(m) total communication cost. The total computation costs of our techniques are also smaller than that of the existing techniques in practice, though theoretically almost the same. Moreover, we also study distributed computing ECCs. We are the first to study this problem and an approach with O(m) total communication cost is proposed. Comprehensive empirical studies demonstrate that our approaches can outperform the existing techniques by one order of magnitude regarding the total running time.  相似文献   

8.
The Surjective Homomorphism problem is to test whether a given graph G called the guest graph allows a vertex-surjective homomorphism to some other given graph H called the host graph. The bijective and injective homomorphism problems can be formulated in terms of spanning subgraphs and subgraphs, and as such their computational complexity has been extensively studied. What about the surjective variant? Because this problem is NP-complete in general, we restrict the guest and the host graph to belong to graph classes \({{\mathcal G}}\) and \({{\mathcal H}}\), respectively. We determine to what extent a certain choice of \({{\mathcal G}}\) and \({{\mathcal H}}\) influences its computational complexity. We observe that the problem is polynomial-time solvable if \({{\mathcal H}}\) is the class of paths, whereas it is NP-complete if \({{\mathcal G}}\) is the class of paths. Moreover, we show that the problem is even NP-complete on many other elementary graph classes, namely linear forests, unions of complete graphs, cographs, proper interval graphs, split graphs and trees of pathwidth at most 2. In contrast, we prove that the problem is fixed-parameter tractable in k if \({{\mathcal G}}\) is the class of trees and \({{\mathcal H}}\) is the class of trees with at most k leaves, or if \({{\mathcal G}}\) and \({{\mathcal H}}\) are equal to the class of graphs with vertex cover number at most k.  相似文献   

9.
Disjunctive Temporal Problems (DTPs) with Preferences (DTPPs) extend DTPs with piece-wise constant preference functions associated to each constraint of the form lx ? yu, where x,y are (real or integer) variables, and l,u are numeric constants. The goal is to find an assignment to the variables of the problem that maximizes the sum of the preference values of satisfied DTP constraints, where such values are obtained by aggregating the preference functions of the satisfied constraints in it under a “max” semantic. The state-of-the-art approach in the field, implemented in the native DTPP solver Maxilitis, extends the approach of the native DTP solver Epilitis. In this paper we present alternative approaches that translate DTPPs to Maximum Satisfiability of a set of Boolean combination of constraints of the form l?x ? y?u, ? ∈{<,≤}, that extend previous work dealing with constant preference functions only. We prove correctness and completeness of the approaches. Results obtained with the Satisfiability Modulo Theories (SMT) solvers Yices and MathSAT on randomly generated DTPPs and DTPPs built from real-world benchmarks, show that one of our translation is competitive to, and can be faster than, Maxilitis (This is an extended and revised version of Bourguet et al. 2013).  相似文献   

10.
In this paper we consider both the maximization variant Max Rep and the minimization variant Min Rep of the famous Label Cover problem. So far the best approximation ratios known for these two problems were \(O(\sqrt{n})\) and indeed some authors suggested the possibility that this ratio is the best approximation factor for these two problems. We show, in fact, that there are a O(n 1/3)-approximation algorithm for Max Rep and a O(n 1/3log?2/3 n)-approximation algorithm for Min Rep. In addition, we also exhibit a randomized reduction from Densest k-Subgraph to Max Rep, showing that any approximation factor for Max Rep implies the same factor (up to a constant) for Densest k-Subgraph.  相似文献   

11.
There is an increasing interest in executing complex analyses over large graphs, many of which require processing a large number of multi-hop neighborhoods or subgraphs. Examples include ego network analysis, motif counting, finding social circles, personalized recommendations, link prediction, anomaly detection, analyzing influence cascades, and others. These tasks are not well served by existing vertex-centric graph processing frameworks, where user programs are only able to directly access the state of a single vertex at a time, resulting in high communication, scheduling, and memory overheads in executing such tasks. Further, most existing graph processing frameworks ignore the challenges in extracting the relevant portions of the graph that an analysis task is interested in, and loading those onto distributed memory. This paper introduces NScale, a novel end-to-end graph processing framework that enables the distributed execution of complex subgraph-centric analytics over large-scale graphs in the cloud. NScale enables users to write programs at the level of subgraphs rather than at the level of vertices. Unlike most previous graph processing frameworks, which apply the user program to the entire graph, NScale allows users to declaratively specify subgraphs of interest. Our framework includes a novel graph extraction and packing (GEP) module that utilizes a cost-based optimizer to partition and pack the subgraphs of interest into memory on as few machines as possible. The distributed execution engine then takes over and runs the user program in parallel on those subgraphs, restricting the scope of the execution appropriately, and utilizes novel techniques to minimize memory consumption by exploiting overlaps among the subgraphs. We present a comprehensive empirical evaluation comparing against three state-of-the-art systems, namely Giraph, GraphLab, and GraphX, on several real-world datasets and a variety of analysis tasks. Our experimental results show orders-of-magnitude improvements in performance and drastic reductions in the cost of analytics compared to vertex-centric approaches.  相似文献   

12.
Given a road network G = (V,E), where V (E) denotes the set of vertices(edges) in G, a set of points of interest P and a query point q residing in G, the reverse furthest neighbors (Rfn R ) query in road networks fetches a set of points pP that take q as their furthest neighbor compared with all points in P ∪ {q}. This is the monochromatic Rfn R (Mrfn R ) query. Another interesting version of Rfn R query is the bichromatic reverse furthest neighbor (Brfn R ) query. Given two sets of points P and Q, and a query point qQ, a Brfn R query fetches a set of points pP that take q as their furthest neighbor compared with all points in Q. This paper presents efficient algorithms for both Mrfn R and Brfn R queries, which utilize landmarks and partitioning-based techniques. Experiments on real datasets confirm the efficiency and scalability of proposed algorithms.  相似文献   

13.
In the Intervalizing Coloured Graphs problem, one must decide for a given graph G = (V, E) with a proper vertex colouring of G whether G is the subgraph of a properly coloured interval graph. For the case that the number of colors is fixed, we give an exact algorithm that uses \(2^{\mathcal {O}(n/\log n)}\) time. We also give an \(\mathcal {O}^{\ast }(2^{n})\) algorithm for the case that the number of colors is not fixed.  相似文献   

14.
We present two parameterized algorithms for the Minimum Fill-in problem, also known as Chordal Completion: given an arbitrary graph G and integer k, can we add at most k edges to G to obtain a chordal graph? Our first algorithm has running time \(\mathcal {O}(k^{2}nm+3.0793^{k})\), and requires polynomial space. This improves the base of the exponential part of the best known parameterized algorithm time for this problem so far. We are able to improve this running time even further, at the cost of more space. Our second algorithm has running time \(\mathcal {O}(k^{2}nm+2.35965^{k})\) and requires \(\mathcal {O}^{\ast}(1.7549^{k})\) space. To achieve these results, we present a new lemma describing the edges that can safely be added to achieve a chordal completion with the minimum number of edges, regardless of k.  相似文献   

15.
How can we discover interesting patterns from time-evolving high-speed data streams? How to analyze the data streams quickly and accurately, with little space overhead? How to guarantee the found patterns to be self-consistent? High-speed data stream has been receiving increasing attention due to its wide applications such as sensors, network traffic, social networks, etc. The most fundamental task on the data stream is frequent pattern mining; especially, focusing on recentness is important in real applications. In this paper, we develop two algorithms for discovering recently frequent patterns in data streams. First, we propose TwMinSwap to find top-k recently frequent items in data streams, which is a deterministic version of our motivating algorithm TwSample providing theoretical guarantees based on item sampling. TwMinSwap improves TwSample in terms of speed, accuracy, and memory usage. Both require only O(k) memory spaces and do not require any prior knowledge on the stream such as its length and the number of distinct items in the stream. Second, we propose TwMinSwap-Is to find top-k recently frequent itemsets in data streams. We especially focus on keeping self-consistency of the discovered itemsets, which is the most important property for reliable results, while using O(k) memory space with the assumption of a constant itemset size. Through extensive experiments, we demonstrate that TwMinSwap outperforms all competitors in terms of accuracy and memory usage, with fast running time. We also show that TwMinSwap-Is is more accurate than the competitor and discovers recently frequent itemsets with reasonably large sizes (at most 5–7) depending on datasets. Thanks to TwMinSwap and TwMinSwap-Is, we report interesting discoveries in real world data streams, including the difference of trends between the winner and the loser of U.S. presidential candidates, and temporal human contact patterns.  相似文献   

16.
In the List H- Homomorphism Problem, for a graph H that is a parameter of the problem, an instance consists of an undirected graph G with a list constraint \({L(v) \subseteq V(H)}\) for each variable \({v \in V(G)}\), and the objective is to determine whether there is a list H-homomorphism \({f:V(G) \to V(H)}\), that is, \({f(v) \in L(v)}\) for every \({v \in V(G)}\) and \({(f(u),f(v)) \in E(H)}\) whenever \({(u,v) \in E(G)}\).We consider the problem of testing list H-homomorphisms in the following weighted setting: An instance consists of an undirected graph G, list constraints L, weights imposed on the vertices of G, and a map \({f:V(G) \to V(H)}\) given as an oracle access. The objective is to determine whether f is a list H-homomorphism or far from any list H-homomorphism. The farness is measured by the total weight of vertices \({v \in V(G)}\) for which f(v) must be changed so as to make f a list H-homomorphism. In this paper, we classify graphs H with respect to the number of queries to f required to test the list H-homomorphisms. Specifically, we show that (i) list H-homomorphisms are testable with a constant number of queries if and only if H is a reflexive complete graph or an irreflexive complete bipartite graph and (ii) list H-homomorphisms are testable with a sublinear number of queries if and only if H is a bi-arc graph.  相似文献   

17.
18.
The k-truss of a graph is the largest edge-induced subgraph such that every edge is contained in at least k triangles within the subgraph, where a triangle is a cycle consisting of three vertices. As a new notion of cohesive subgraphs, truss has recently attracted a lot of research attentions in the database and data mining fields. At the same time, uncertainty is an intrinsic property of massive graph data, and truss decomposition (i.e., finding all k-trusses of a graph) has become a key primitive on uncertain graphs. In this paper, we study the truss decomposition problem on uncertain graphs, that is, finding all highly probable k-trusses of an uncertain graph. We first give an formal statement of the truss decomposition problem on uncertain graphs. Then, we prove that the truss decomposition of an uncertain graph attains two elegant properties, namely uniqueness and hierarchy. We show that the truss decomposition of an uncertain graph can be found in \(O(m^{1.5}Q)\) time by proposing an in-memory algorithm called \(\mathtt {TD_{mem}}\), where m is the number of edges of the uncertain graph, and Q is at most the maximum number of common neighbors of the endpoints of an edge. When an uncertain graph is too large to fit into main memory, we propose an external-memory algorithm \(\mathtt {TD_{I/O}}\) to find the truss decomposition of the uncertain graph. Extensive experiments have been carried out to evaluate the practical performance of the proposed algorithms. The experimental results verify that both \(\mathtt {TD_{mem}}\) and \(\mathtt {TD_{I/O}}\) are efficient when an uncertain graph is small enough to fit into main memory, and that \(\mathtt {TD_{I/O}}\) is much faster than \(\mathtt {TD_{mem}}\) when the graph is too large to fit into main memory.  相似文献   

19.
Given a large collection of co-evolving online activities, such as searches for the keywords “Xbox”, “PlayStation” and “Wii”, how can we find patterns and rules? Are these keywords related? If so, are they competing against each other? Can we forecast the volume of user activity for the coming month? We conjecture that online activities compete for user attention in the same way that species in an ecosystem compete for food. We present EcoWeb, (i.e., Ecosystem on the Web), which is an intuitive model designed as a non-linear dynamical system for mining large-scale co-evolving online activities. Our second contribution is a novel, parameter-free, and scalable fitting algorithm, EcoWeb-Fit, that estimates the parameters of EcoWeb. Extensive experiments on real data show that EcoWeb is effective, in that it can capture long-range dynamics and meaningful patterns such as seasonalities, and practical, in that it can provide accurate long-range forecasts. EcoWeb consistently outperforms existing methods in terms of both accuracy and execution speed.  相似文献   

20.
Aggregate similarity search, also known as aggregate nearest-neighbor (Ann) query, finds many useful applications in spatial and multimedia databases. Given a group Q of M query objects, it retrieves from a database the objects most similar to Q, where the similarity is an aggregation (e.g., \({{\mathrm{sum}}}\), \(\max \)) of the distances between each retrieved object p and all the objects in Q. In this paper, we propose an added flexibility to the query definition, where the similarity is an aggregation over the distances between p and any subset of \(\phi M\) objects in Q for some support \(0< \phi \le 1\). We call this new definition flexible aggregate similarity search and accordingly refer to a query as a flexible aggregate nearest-neighbor ( Fann ) query. We present algorithms for answering Fann queries exactly and approximately. Our approximation algorithms are especially appealing, which are simple, highly efficient, and work well in both low and high dimensions. They also return near-optimal answers with guaranteed constant-factor approximations in any dimensions. Extensive experiments on large real and synthetic datasets from 2 to 74 dimensions have demonstrated their superior efficiency and high quality.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号