首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The widespread use of graph-based models for representing data collections (e.g. object-oriented data, XML data, etc.) has stimulated the database research community to investigate the problem of defining declarative languages for querying graph-like databases. In this paper, a new framework for querying graph-like data based on graph grammars is proposed. The new paradigm allows us to verify structural properties of graphs and to extract sub-graphs. More specifically, a new form of query (namely graph query) is proposed, consisting in a particular graph grammar which defines a class of graphs to be matched on the graph representing the database. Thus, differently from path queries, the answer of a graph query is not just a set of nodes, but a subgraph, extracted from the input graph, which satisfies the structural properties defined by the graph grammar. Expressiveness and complexity of different forms of graph queries are discussed, and some practical applications are shown.  相似文献   

2.
The growing popularity of graph databases has generated interesting data management problems, such as subgraph search, shortest path query, reachability verification, and pattern matching. Among these, a pattern match query is more flexible compared with a subgraph search and more informative compared with a shortest path or a reachability query. In this paper, we address distance-based pattern match queries over a large data graph G. Due to the huge search space, we adopt a filter-and-refine framework to answer a pattern match query over a large graph. We first find a set of candidate matches by a graph embedding technique and then evaluate these to find the exact matches. Extensive experiments confirm the superiority of our method.  相似文献   

3.
Subgraph querying has wide applications in various fields such as cheminformatics and bioinformatics. Given a query graph, q, a subgraph-querying algorithm retrieves all graphs, D(q), which have q as a subgraph, from a graph database, D. Subgraph querying is costly because it uses subgraph isomorphism tests, which are NP-complete. Graph indices are commonly used to improve the performance of subgraph querying in graph databases. Subgraph-querying algorithms first construct a candidate answer set by filtering out a set of false answers and then verify each candidate graph using subgraph isomorphism tests. To build graph indices, various kinds of substructure (subgraph, subtree, or path) features have been proposed with the goal of maximizing the filtering rate. Each of them works with a specifically designed index structure, for example, discriminative and frequent subgraph features work with gIndex, δ-TCFG features work with FG-index, etc. We propose Lindex, a graph index, which indexes subgraphs contained in database graphs. Nodes in Lindex represent key-value pairs where the key is a subgraph in a database and the value is a list of database graphs containing the key. We propose two heuristics that are used in the construction of Lindex that allows us to determine answers to subgraph queries conducting less subgraph isomorphism tests. Consequently, Lindex improves subgraph-querying efficiency. In addition, Lindex is compatible with any choice of features. Empirically, we demonstrate that Lindex used in conjunction with subgraph indexing features proposed in previous works outperforms other specifically designed index structures. As a novel index structure, Lindex (1) is effective in filtering false graphs (2) provides fast index lookups, (3) is fast with respect to index construction and maintenance, and (4) can be constructed using any set of substructure index features. These four properties result in a fast and scalable subgraph-querying infrastructure. We substantiate the benefits of Lindex and its disk-resident variation Lindex+ theoretically and empirically.  相似文献   

4.
Flesca  Sergio  Furfaro  Filippo  Greco  Sergio 《World Wide Web》2002,5(2):125-157
In this paper we present a graphical query language for XML. The language, based on a simple form of graph grammars, permits us to extract data and reorganize information in a new structure. As with most of the current query languages for XML, queries consist of two parts: one extracting a subgraph and one constructing the output graph. The semantics of queries is given in terms of graph grammars. The use of graph grammars makes it possible to define, in a simple way, the structural properties of both the subgraph that has to be extracted and the graph that has to be constructed. We provide an example-driven comparison of our language w.r.t. other XML query languages, and show the effectiveness and simplicity of our approach.  相似文献   

5.
Output-Sensitive Reporting of Disjoint Paths   总被引:1,自引:0,他引:1  
A k -path query on a graph consists of computing k vertex-disjoint paths between two given vertices of the graph, whenever they exist. In this paper we study the problem of performing k -path queries, with , in a graph G with n vertices. We denote with the total length of the reported paths. For , we present an optimal data structure for G that uses O(n) space and executes k -path queries in output-sensitive time. For triconnected planar graphs, our results make use of a new combinatorial structure that plays the same role as bipolar (st ) orientations for biconnected planar graphs. This combinatorial structure also yields an alternative construction of convex grid drawings of triconnected planar graphs. Received August 24, 1996; revised April 8, 1997.  相似文献   

6.
We present a general technique for dynamizing a class of problems whose underlying structure is a computation graph embedded in a tree. We introduce three fully dynamic data structures, called path attribute systems, tree attribute systems, and linear attribute grammars, which extend and generalize the dynamic trees of Sleator and Tarjan. More specifically, we associate values, called attributes, with the nodes and paths of a rooted tree. Path attributes form a path attribute system if they can be maintained in constant time under path concatenation. Node attributes form a tree attribute system if the tree attributes of the tail of a path Π can be determined in constant time from the path attributes of Π. A linear attribute grammar is a tree-based linear expression such that the values of a node μ are calculated from the values at the parent, siblings, and/for children of μ. We provide a framework for maintaining path attribute systems, tree attribute systems, and linear attribute grammars in a fully dynamic environment using linear space and logarithmic time per operation. Also, we demonstrate the applicability of our techniques by showing examples of graph and geometric problems that can be efficiently dynamized, including biconnectivity and triconnectivity queries, planarity testing, drawing trees and series-parallel digraphs, slicing floorplan compaction, point location, and many optimization problems on bounded tree-width graphs. Received May 13, 1994; revised October 12, 1995.  相似文献   

7.
In this paper, we study a variant of reachability queries, called label-constraint reachability (LCR) queries. Specifically, given a label set S and two vertices u1 and u2 in a large directed graph G, we check the existence of a directed path from u1 to u2, where edge labels along the path are a subset of S. We propose the path-label transitive closure method to answer LCR queries. Specifically, we t4ransform an edge-labeled directed graph into an augmented DAG by replacing the maximal strongly connected components as bipartite graphs. We also propose a Dijkstra-like algorithm to compute path-label transitive closure by re-defining the “distance” of a path. Comparing with the existing solutions, we prove that our method is optimal in terms of the search space. Furthermore, we propose a simple yet effective partition-based framework (local path-label transitive closure+online traversal) to answer LCR queries in large graphs. We prove that finding the optimal graph partition to minimize query processing cost is a NP-hard problem. Therefore, we propose a sampling-based solution to find the sub-optimal partition. Moreover, we address the index maintenance issues to answer LCR queries over the dynamic graphs. Extensive experiments confirm the superiority of our method.  相似文献   

8.
We propose a new way of indexing a large database of small and medium-sized graphs and processing exact subgraph matching (or subgraph isomorphism) and approximate (full) graph matching queries. Rather than decomposing a graph into smaller units (e.g., paths, trees, graphs) for indexing purposes, we represent each graph in the database by its graph signature, which is essentially a multiset. We construct a disk-based index on all the signatures via bulk loading. During query processing, a query graph is also mapped into its signature, and this signature is searched using the index by performing multiset operations. To improve the precision of exact subgraph matching, we develop a new scheme using the concept of line graphs. Through extensive evaluation on real and synthetic graph datasets, we demonstrate that our approach provides a scalable and efficient disk-based solution for a large database of small and medium-sized graphs.  相似文献   

9.
We consider graphs whose vertices may be in one of two different states: either on or off . We wish to maintain dynamically such graphs under an intermixed sequence of updates and queries. An update may reverse the status of a vertex, by switching it either on or off , and may insert a new edge or delete an existing edge. A query tests whether any two given vertices are connected in the subgraph induced by the vertices that are on . We give efficient algorithms that maintain information about connectivity on planar graphs in O( log 3 n) amortized time per query, insert, delete, switch-on, and switch-off operation over sequences of at least Ω(n) operations, where n is the number of vertices of the graph. Received September 1997; revised January 1999.  相似文献   

10.
Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In this paper, we propose the query language nSPARQL that uses nested regular expressions to navigate RDF data. We study some of the fundamental properties of nSPARQL and nested regular expressions concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary in nSPARQL to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that given an RDF graph G and a nested regular expression E, this problem can be solved in time O(|G||E|).  相似文献   

11.
We propose techniques for processing SPARQL queries over a large RDF graph in a distributed environment. We adopt a “partial evaluation and assembly” framework. Answering a SPARQL query Q is equivalent to finding subgraph matches of the query graph Q over RDF graph G. Based on properties of subgraph matching over a distributed graph, we introduce local partial match as partial answers in each fragment of RDF graph G. For assembly, we propose two methods: centralized and distributed assembly. We analyze our algorithms from both theoretically and experimentally. Extensive experiments over both real and benchmark RDF repositories of billions of triples confirm that our method is superior to the state-of-the-art methods in both the system’s performance and scalability.  相似文献   

12.
With ever growing databases containing multimedia data, indexing has become a necessity to avoid a linear search. We propose a novel technique for indexing multimedia databases in which entries can be represented as graph structures. In our method, the topological structure of a graph as well as that of its subgraphs are represented as vectors whose components correspond to the sorted laplacian eigenvalues of the graph or subgraphs. Given the laplacian spectrum of graph G, we draw from recently developed techniques in the field of spectral integral variation to generate the laplacian spectrum of graph G+e without computing its eigendecomposition, where G+e is a graph obtained by adding edge e to graph G. This process improves the performance of the system for generating the subgraph signatures for 1.8% and 6.5% in datasets of size 420 and 1400, respectively. By doing a nearest neighbor search around the query spectra, similar but not necessarily isomorphic graphs are retrieved. Given a query graph, a voting schema ranks database graphs into an indexing hypothesis to which a final matching process can be applied. The novelties of the proposed method come from the powerful representation of the graph topology and successfully adopting the concept of spectral integral variation in an indexing algorithm. To examine the fitness of the new indexing framework, we have performed a number of experiments using an extensive set of recognition trials in the domain of 2D and 3D object recognition. The experiments, including a comparison with a competing indexing method using two different graph-based object representations, demonstrate both the robustness and efficacy of the overall approach.  相似文献   

13.
We consider the problem of exploring an anonymous undirected graph using an oblivious robot. The studied exploration strategies are designed so that the next edge in the robot’s walk is chosen using only local information, and so that some local equity (fairness) criterion is satisfied for the adjacent undirected edges. Such strategies can be seen as an attempt to derandomize random walks, and are natural counterparts for undirected graphs of the rotor-router model for symmetric directed graphs. The first of the studied strategies, known as Oldest-First, always chooses the neighboring edge for which the most time has elapsed since its last traversal. Unlike in the case of symmetric directed graphs, we show that such a strategy in some cases leads to exponential cover time. We then consider another strategy called Least-Used-First which always uses adjacent edges which have been traversed the smallest number of times. We show that any Least-Used-First exploration covers a graph G = (V, E) of diameter D within time O(D|E|), and in the long run traverses all edges of G with the same frequency.  相似文献   

14.
基于最小生成树的图数据库索引算法   总被引:1,自引:0,他引:1  
李楠  高宏  李建中 《软件学报》2009,20(Z1):144-153
对复杂数据进行图模式建模近几年越来越流行,因此,在查询执行的优化过程中图索引技术变得至关重要.研究了图模式的索引问题,并且提出了一种近似的索引方法,称为MSTA方法.MSTA方法利用最小生成树结构作为索引特征,依据最小生成树边序列的包含关系和基于最大公共子图的图距离度量,将最小生成树组织到一个称为MST树的索引结构中.MST树索引结构可以高效地支持多种查询,例如子图查询.MSTA方法具备高效的索引性能.在索引大小和索引建立时间方面,传统方法是MSTA方法的数十倍,甚至上百倍.MSTA方法虽然不能返回完整结果,但是可以返回经图距离度量排序最好的部分结果.  相似文献   

15.
Summary Node label controlled (NLC) grammars are graph grammars (operating on node labeled undirected graphs) which rewrite single nodes only and establish connections between the embedded graph and the neighbors of the rewritten node on the basis of the labels of the involved nodes only. They define (possibly infinite) languages of undirected node labeled graphs (or, if we just omit the labels, languages of unlabeled graphs). Boundary NLC (BNLC) grammars are NLC grammars with the property that whenever — in a graph already generated — two nodes may be rewritten, then these nodes are not adjacent. The graph languages generated by this type of grammars are called BNLC languages. The present paper continues the investigations of basic properties of BNLC grammars and languages where the central question is the following: If L is a BNLC language and P is a graph theoretic property, is the set of all graphs from L satisfying P again a BNLC language? We demonstrate that the class of BNLC languages is very stable in the sense that for almost all properties we consider the resulting languages are BNLC. In particular, the above question gets an affirmative answer, if the property P is: being k-colorable, being connected, having a subgraph homeomorphic to a given graph, and being nonplanar.This research was carried out during the second author's stay at the Rijksuniversiteit Leiden, The Netherlands  相似文献   

16.
Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We considered both cases: extracting and storing different types of structural information in advance, and extracting information on-the-fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.  相似文献   

17.
RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for RDF, SPARQL allows querying RDF through graph patterns, i.e., RDF graphs involving variables. Other languages, inspired by the work in databases, use regular expressions for searching paths in RDF graphs. Each approach can express queries that are out of reach of the other one. Hence, we aim at combining these two approaches. For that purpose, we define a language, called PRDF (for “Path RDF”) which extends RDF such that the arcs of a graph can be labeled by regular expression patterns. We provide PRDF with a semantics extending that of RDF, and propose a correct and complete algorithm which, by computing a particular graph homomorphism, decides the consequence between an RDF graph and a PRDF graph. We then define the PSPARQL query language, extending SPARQL with PRDF graph patterns and complying with RDF model theoretic semantics. PRDF thus offers both graph patterns and path expressions. We show that this extension does not increase the computational complexity of SPARQL and, based on the proposed algorithm, we have implemented a correct and complete PSPARQL query engine.  相似文献   

18.
In many applications, XML documents need to be modelled as graphs. The query processing of graph-structured XML documents brings new challenges. In this paper, we design a method based on labelling scheme for structural queries processing on graph-structured XML documents. We give each node some labels, the reachability labelling scheme. By extending an interval-based reachability labelling scheme for DAG by Rakesh et al., we design labelling schemes to support the judgements of reachability relationships for general graphs. Based on the labelling schemes, we design graph structural join algorithms to answer the structural queries with only ancestor-descendant relationship efficiently. For the processing of subgraph query, we design a subgraph join algorithm. With efficient data structure, the subgraph join algorithm can process subgraph queries with various structures efficiently. Experimental results show that our algorithms have good performance and scalability. Support by the Key Program of the National Natural Science Foundation of China under Grant No.60533110; the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000; the National Natural Science Foundation of China under Grant No. 60773068 and No. 60773063.  相似文献   

19.
20.
For an arbitrary filled graph G+ of a given original graph G, we consider the problem of removing fill edges from G+ in order to obtain a graph M that is both a minimal filled graph of G and a subgraph of G+. For G+ with f fill edges and e original edges, we give a simple O(f(e+f)) algorithm which solves the problem and computes a corresponding minimal elimination ordering of G. We report on experiments with an implementation of our algorithm, where we test graphs G corresponding to some real sparse matrix applications and apply well-known and widely used ordering heuristics to find G+. Our findings show the amount of fill that is commonly removed by a minimalization for each of these heuristics, and also indicate that the runtime of our algorithm on these practical graphs is better than the presented worst-case bound.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号