1.

Fast graph query processing with a low-cost index

James Cheng Yiping Ke Ada Wai-Chee Fu Jeffrey Xu Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2011,20(4):521-539

This paper studies the problem of processing supergraph queries, that is, given a database containing a set of graphs, find all the graphs in the database of which the query graph is a supergraph. Existing works usually construct an index and perform a filtering-and-verification process, which still requires many subgraph isomorphism tests. There are also significant overheads in both index construction and maintenance. In this paper, we design a graph querying system that achieves both fast indexing and efficient query processing. The index is constructed by a simple but fast method of extracting the commonality among the graphs, which does not involve any costly operation such as graph mining. Our query processing has two key techniques, direct inclusion and filtering. Direct inclusion allows partial query answers to be included directly without candidate verification. Our filtering technique further reduces the candidate set by operating on a much smaller projected database. Experimental results show that our method is significantly more efficient than the existing works in both indexing and query processing, and our index has a low maintenance cost.

2.

Lindex: a lattice-based index for graph databases

Dayu Yuan Prasenjit Mitra 《The VLDB Journal The International Journal on Very Large Data Bases》2013,22(2):229-252

Subgraph querying has wide applications in various fields such as cheminformatics and bioinformatics. Given a query graph, q, a subgraph-querying algorithm retrieves all graphs, D(q), which have q as a subgraph, from a graph database, D. Subgraph querying is costly because it uses subgraph isomorphism tests, which are NP-complete. Graph indices are commonly used to improve the performance of subgraph querying in graph databases. Subgraph-querying algorithms first construct a candidate answer set by filtering out a set of false answers and then verify each candidate graph using subgraph isomorphism tests. To build graph indices, various kinds of substructure (subgraph, subtree, or path) features have been proposed with the goal of maximizing the filtering rate. Each of them works with a specifically designed index structure, for example, discriminative and frequent subgraph features work with gIndex, δ-TCFG features work with FG-index, etc. We propose Lindex, a graph index, which indexes subgraphs contained in database graphs. Nodes in Lindex represent key-value pairs where the key is a subgraph in a database and the value is a list of database graphs containing the key. We propose two heuristics, used in the construction of Lindex, that allow us to determine answers to subgraph queries with fewer subgraph isomorphism tests. Consequently, Lindex improves subgraph-querying efficiency. In addition, Lindex is compatible with any choice of features. Empirically, we demonstrate that Lindex used in conjunction with subgraph indexing features proposed in previous works outperforms other specifically designed index structures. As a novel index structure, Lindex (1) is effective in filtering false graphs, (2) provides fast index lookups, (3) is fast with respect to index construction and maintenance, and (4) can be constructed using any set of substructure index features. These four properties result in a fast and scalable subgraph-querying infrastructure. We substantiate the benefits of Lindex and its disk-resident variation Lindex+ theoretically and empirically.
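
The filter-then-verify pipeline and the key-value index shape described in this abstract can be illustrated with a minimal sketch. This is not Lindex itself: the feature choice (unordered label pairs on edges), the graph encoding, and every function name below are assumptions made for illustration, and verification uses a brute-force test that is only practical for tiny graphs.

```python
from itertools import permutations

# A graph is a pair (labels, edges): labels maps vertex -> label,
# edges is a list of undirected (u, v) pairs. This encoding is an assumption.

def edge_features(labels, edges):
    """Return the set of unordered label pairs that occur on edges."""
    return {frozenset((labels[u], labels[v])) for u, v in edges}

def build_index(database):
    """Map each feature (key) to the ids of database graphs containing it (value)."""
    index = {}
    for gid, (labels, edges) in database.items():
        for feat in edge_features(labels, edges):
            index.setdefault(feat, set()).add(gid)
    return index

def candidates(index, database, query):
    """Filtering: keep only graphs that contain every feature of the query."""
    result = set(database)
    for feat in edge_features(*query):
        result &= index.get(feat, set())
    return result

def is_subgraph(query, graph):
    """Verification: brute-force (non-induced) subgraph isomorphism test."""
    q_labels, q_edges = query
    g_labels, g_edges = graph
    g_edge_set = {frozenset(e) for e in g_edges}
    q_nodes = list(q_labels)
    for image in permutations(g_labels, len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        labels_ok = all(q_labels[v] == g_labels[mapping[v]] for v in q_nodes)
        edges_ok = all(frozenset((mapping[u], mapping[v])) in g_edge_set
                       for u, v in q_edges)
        if labels_ok and edges_ok:
            return True
    return False

def subgraph_query(index, database, query):
    """Return the sorted ids of database graphs that contain the query."""
    return sorted(gid for gid in candidates(index, database, query)
                  if is_subgraph(query, database[gid]))

# Hypothetical toy database of labeled graphs.
database = {
    "g1": ({1: "C", 2: "O", 3: "N"}, [(1, 2), (2, 3)]),
    "g2": ({1: "C", 2: "O"}, [(1, 2)]),
    "g3": ({1: "C", 2: "N"}, [(1, 2)]),
    "g4": ({1: "C", 2: "O", 3: "O", 4: "N"}, [(1, 2), (3, 4)]),
}
query = ({"a": "C", "b": "O", "c": "N"}, [("a", "b"), ("b", "c")])
index = build_index(database)
```

In this toy data, `g4` shares every feature with the query and so survives filtering, but it is rejected during verification; an index with more discriminative features would be expected to prune such false candidates earlier.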

3.

QUBLE: towards blending interactive visual subgraph search queries on large networks

Ho Hoang Hung Sourav S Bhowmick Ba Quan Truong Byron Choi Shuigeng Zhou 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(3):401-426

In a previous paper, we laid out the vision of a novel graph query processing paradigm that, instead of processing a visual query graph after its construction, interleaves visual query formulation and processing by exploiting the latency offered by the GUI to filter irrelevant matches and prefetch partial query results [8]. Our recent attempts at implementing this vision [8, 9] show significant improvement in system response time (SRT) for subgraph queries. However, these efforts are designed specifically for graph databases containing a large collection of small or medium-sized graphs. In this paper, we propose a novel algorithm called QUBLE (QUery Blender for Large nEtworks) to realize this visual subgraph querying paradigm on very large networks (e.g., protein interaction networks, social networks). First, it decomposes a large network into a set of graphlets and supergraphlets using a minimum cut-based graph partitioning technique. Next, it mines approximate frequent and small infrequent fragments (SIFs) from them and identifies their occurrences in these graphlets and supergraphlets. Then, the indexing framework of [9] is enhanced so that the mined fragments can be exploited to index graphlets for efficient blending of visual subgraph query formulation and query processing. Extensive experiments on large networks demonstrate the effectiveness of QUBLE.

4.

Using partial evaluation in holistic subgraph search

Peng PENG Lei ZOU Zhenqin DU Dongyan ZHAO 《Frontiers of Computer Science》2018,12(5):966-983

Because of its wide application, the subgraph matching problem has been studied extensively during the past decade. However, most existing solutions assume that a data graph is a vertex/edge-labeled graph (i.e., each vertex/edge has a simple label). These solutions build structural indices by considering the vertex labels. However, some real graphs contain rich-content vertices such as user profiles in social networks and HTML pages on the World Wide Web. In this study, we consider the problem of subgraph matching using a more general scenario. We build a structural index that does not depend on any vertex content. Based on the index, we design a holistic subgraph matching algorithm that considers the query graph as a whole and finds one match at a time. In order to further improve efficiency, we propose a “partial evaluation and assembly” framework to find subgraph matches over large graphs. Last but not least, our index has light maintenance overhead. Therefore, our method can work well on dynamic graphs. Extensive experiments on real graphs show that our method outperforms the state-of-the-art algorithms.

5.

An efficient subgraph matching method based on RDF graph structure segmentation

关皓元 朱斌 李冠宇 赵玲 《计算机应用》2018,38(7):1898-1904

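In SPARQL query processing, graph-based query efficiency degrades as the structure of the query graph becomes more complex. To address this problem, several basic structures of Resource Description Framework (RDF) graphs are analyzed and a subgraph matching method based on query graph structure segmentation, called RSM, is proposed. First, the query graph is split into several structurally simple query subgraphs, and the search space of the query graph nodes is defined through an adjacent-predicate structure index. Then, the search space is narrowed using adjacent subgraph structures, and matching subgraph structures are found in the data graph according to the search ranges in the search space. Finally, the resulting subgraphs are joined and output as the query result. RSM is compared with the mainstream query methods RDF-3X, R3F, and GraSS in terms of query response time on different datasets for query graphs of varying complexity. The experimental results show that, compared with the other three methods, RSM has a shorter query response time and higher query efficiency when processing structurally complex query graphs.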

6.

Kronecker product graph matching

B.J. van Wyk M.A. van Wyk 《Pattern recognition》2003,36(9):2019-2030

In this paper the Interpolator-based Kronecker product graph matching (IBKPGM) algorithm for performing attributed graph matching is presented. The IBKPGM algorithm is based on the Kronecker product graph matching (KPGM) formulation. This new formulation incorporates a general approach to a wide class of graph matching problems based on attributed graphs, allowing the structure of the graphs to be based on multiple sets of attributes. Salient features of the IBKPGM algorithm are that no assumption is made about the adjacency structure of the graphs to be matched, and that the explicit calculation of compatibility values between all vertices of the reference and input graphs, as well as between all edges of the reference and input graphs, is avoided.

7.

A graph database indexing algorithm based on minimum spanning trees (total citations: 1; self-citations: 0; citations by others: 1)

李楠 高宏 李建中 《软件学报》2009,20(Z1):144-153

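Modeling complex data as graphs has become increasingly popular in recent years, so graph indexing techniques have become essential for optimizing query execution. This paper studies the problem of indexing graph patterns and proposes an approximate indexing method called MSTA. MSTA uses minimum spanning tree structures as index features. Based on the containment relationships among minimum spanning tree edge sequences and on a graph distance measure derived from the maximum common subgraph, it organizes the minimum spanning trees into an index structure called the MST tree. The MST tree index efficiently supports several kinds of queries, such as subgraph queries. MSTA offers efficient indexing performance: in index size and index construction time, traditional methods cost tens or even hundreds of times as much as MSTA. Although MSTA cannot return the complete result set, it can return the best-ranked partial results ordered by the graph distance measure.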

8.

Structure and attribute index for approximate graph matching in large graphs (total citations: 2; self-citations: 0; citations by others: 2)

Linhong Zhu Wee Keong Ng James Cheng 《Information Systems》2011

The increasing popularity of graph data in various domains has led to a renewed interest in developing efficient graph matching techniques, especially for processing large graphs. In this paper, we study the problem of approximate graph matching in a large attributed graph. Given a large attributed graph and a query graph, we compute a subgraph of the large graph that best matches the query graph. We propose a novel structure-aware and attribute-aware index to process approximate graph matching in a large attributed graph. We first construct an index on the similarity of the attributed graph, by partitioning the large search space into smaller subgraphs based on structure similarity and attribute similarity. Then, we construct a connectivity-based index to give a concise representation of inter-partition connections. We use the index to find a set of best matching paths. From these best matching paths, we compute the best matching answer graph using a greedy algorithm. Experimental results on real datasets demonstrate the efficiency of both index construction and query processing. We also show that our approach attains high-quality query answers.

9.

Indexing through laplacian spectra

M. Fatih Demirci Reinier H. van Leuken Remco C. Veltkamp 《Computer Vision and Image Understanding》2008,110(3):312

With ever-growing databases containing multimedia data, indexing has become a necessity to avoid a linear search. We propose a novel technique for indexing multimedia databases in which entries can be represented as graph structures. In our method, the topological structure of a graph as well as that of its subgraphs are represented as vectors whose components correspond to the sorted laplacian eigenvalues of the graph or subgraphs. Given the laplacian spectrum of graph G, we draw from recently developed techniques in the field of spectral integral variation to generate the laplacian spectrum of graph G+e without computing its eigendecomposition, where G+e is a graph obtained by adding edge e to graph G. This process improves the performance of the system for generating the subgraph signatures by 1.8% and 6.5% in datasets of size 420 and 1400, respectively. By doing a nearest neighbor search around the query spectra, similar but not necessarily isomorphic graphs are retrieved. Given a query graph, a voting schema ranks database graphs into an indexing hypothesis to which a final matching process can be applied. The novelties of the proposed method come from the powerful representation of the graph topology and successfully adopting the concept of spectral integral variation in an indexing algorithm. To examine the fitness of the new indexing framework, we have performed a number of experiments using an extensive set of recognition trials in the domain of 2D and 3D object recognition. The experiments, including a comparison with a competing indexing method using two different graph-based object representations, demonstrate both the robustness and efficacy of the overall approach.

10.

Answering exact distance queries on real-world graphs with bounded performance guarantees

Yang Xiang 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(5):677-695

The ability to efficiently obtain exact distance information from both directed and undirected graphs is desired by many real-world applications. In this work, we unified the query indexing efforts on directed and undirected graphs into one by proposing the TreeMap approach. Our approach has very tight bounds on query time, index size, and construction time for answering queries on both directed and undirected graphs. The query time complexity is close to constant for graphs with a small width of tree decomposition, and the index construction can be completed without materializing the distance matrix or other high-cost operations. In the empirical study, we demonstrated that the TreeMap approach in general performs much better than competitive methods in indexing real graphs for answering exact distance queries.

11.

A new frequent subgraph mining algorithm

敦景峰 张伟 柴然 《计算机工程》2011,37(20):27-29

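Traditional Apriori-based frequent subgraph mining algorithms generate a large number of redundant subgraphs. To address this problem, a new frequent subgraph mining algorithm (GAI) is proposed. A three-layer MADI index structure is introduced to store information about the graph set and thereby reduce the number of scans of the graph set. Frequent subgraphs are constructed by extending an ETree, and candidate subgraphs are stored in a table, which avoids generating redundant graphs during extension and avoids scanning the entire database, thereby simplifying support computation and improving graph/subgraph isomorphism...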

12.

Efficient subsequence matching over large video databases

Xiangmin Zhou Xiaofang Zhou Lei Chen Athman Bouguettaya 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(4):489-508

Video similarity matching has broad applications such as copyright detection, news tracking, and commercial monitoring. Among these applications, one typical task is to detect the local similarity between two videos without knowledge of the positions and lengths of each matched subclip pair. However, most studies so far on video detection investigate the global similarity between two short clips using a pre-defined distance function. Although there are a few works on video subsequence detection, all these proposals fail to provide an effective query processing mechanism. In this paper, we first generalize the problem of video similarity matching. Then, a novel solution called consistent keyframe matching (CKM) is proposed to solve the problem of subsequence matching based on video segmentation. CKM is designed with two goals: (1) good scalability in terms of the query sequence length and the size of video database and (2) fast video subsequence matching in terms of processing time. Good scalability is achieved by employing a batch query paradigm, where keyframes sharing the same query space are summarized and ordered. As such, the redundancy of data access is eliminated, leading to much faster video query processing. Fast subsequence matching is achieved by comparing the keyframes of different video sequences. Specifically, a keyframe matching graph is first constructed and then divided into matched candidate subgraphs. We have evaluated our proposed approach over a very large real video database. Extensive experiments demonstrate the effectiveness and efficiency of our approach.

13.

Graph indexing for spatial data traversal in road map databases

《Computers & Operations Research》2001,28(3):223-241

This paper proposes a graph indexing technique for processing constrained spatial queries and discusses the application of such a technique to road map databases where the graph topology is relatively stationary. The fundamental idea of our technique is to augment the original graph with selected augmented links so that query processing cost, especially I/O cost, is minimized. Based on the computational results derived from the probabilistic analysis, we found that the proposed graph indexing technique is a promising approach for significantly reducing costs of spatial queries. Scope and purpose: Spatial data is found in geographic information systems where data attributes are associated with nodes and links in directed graphs. Queries on spatial data are generally expensive because of the recursive nature of spatial data traversal. We propose a graph indexing technique to expedite queries on spatial data. The graph index is an instrument for early identification of the relevant nodes and links to the query so that repeated accesses to the same data pages can be eliminated. This paper presents the graph indexing technique in the context of road map databases and shows that the graph indexing technique can improve significantly on the efficiency of constrained queries on spatial data.

14.

Graph Grammars for Querying Graph-like Data

S. Flesca F. Furfaro S. Greco 《Electronic Notes in Theoretical Computer Science》2001,50(3)

Recent research has investigated in depth the problem of querying semi-structured data and data which can be represented by means of graphs (e.g. object-oriented data, XML data, etc.). Typically queries on graph-like data, called path queries, are expressed by means of regular expressions denoting paths in the graph. The result of a path query is the set of nodes reachable by means of a path expressed by a specified regular expression. In this paper we investigate the problem of extracting a subgraph satisfying a given property from a given graph representing some information. We propose a new form of queries, called graph queries, whose answers are (marked) graphs having a particular structure, extracted from the source graph. We show that a simple form of graph grammars can be profitably used to define graph queries. The result of a graph query, using a grammar G over a database D, is a marked subgraph of D ‘matching’ a graph derived from G. We consider different types of graph grammars which can be used to query graph-like data and consider their expressiveness and complexity.
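
The path queries mentioned in this abstract (nodes reachable along a path whose edge labels match a regular expression) can be sketched as a breadth-first search over pairs of graph node and automaton state. This is only an illustrative sketch, not the graph-grammar approach the paper proposes: the adjacency-list encoding, the hand-written deterministic automata standing in for compiled regular expressions, and all names below are assumptions.

```python
from collections import deque

def path_query(graph, source, dfa):
    """Return nodes reachable from `source` along a path whose label word the DFA accepts.

    graph: dict node -> list of (edge_label, target_node)
    dfa:   (transitions, start_state, accepting_states), where transitions
           maps (state, edge_label) -> next_state
    """
    transitions, start, accepting = dfa
    seen = {(source, start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            answers.add(node)
        for label, target in graph.get(node, []):
            nxt = transitions.get((state, label))
            # Follow the edge only if the automaton has a move on this label.
            if nxt is not None and (target, nxt) not in seen:
                seen.add((target, nxt))
                queue.append((target, nxt))
    return answers

# Hypothetical edge-labeled graph.
graph = {
    "a": [("child", "b"), ("ref", "d")],
    "b": [("child", "c")],
    "c": [("ref", "d")],
    "d": [],
}

# Hand-built automata for the expressions  child* ref  and  child+ .
child_star_ref = ({(0, "child"): 0, (0, "ref"): 1}, 0, {1})
child_plus = ({(0, "child"): 1, (1, "child"): 1}, 0, {1})
```

Because each (node, state) pair is visited at most once, the search terminates even on cyclic graphs.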

15.

An Efficient Framework for Multiple Subgraph Pattern Matching Models

Gao Jiu-Ru Chen Wei Xu Jia-Jie Liu An Li Zhi-Xu Yin Hongzhi Zhao Lei 《计算机科学技术学报》2019,34(6):1185-1202

The growing popularity of storing large data graphs in the cloud has motivated subgraph pattern matching on a remote cloud. Typically, subgraph pattern matching is defined in terms of subgraph isomorphism, which is an NP-complete problem and sometimes too strict to find useful matches in certain applications. In addition, how to protect the privacy of data graphs in subgraph pattern matching without undermining matching results is an important concern. Thus, we propose a novel framework to achieve privacy-preserving subgraph pattern matching in the cloud. In order to protect the structural privacy in data graphs, we first develop a method based on the k-automorphism model. Additionally, we use a cost-model-based label generalization method to protect label privacy in both data graphs and pattern graphs. During the generation of the k-automorphic graph, a large number of noise edges or vertices might be introduced to the original data graph. Thus, we use the outsourced graph, which is only a subset of a k-automorphic graph, to answer the subgraph pattern matching. The efficiency of the pattern matching process can be greatly improved in this way. Extensive experiments on real-world datasets demonstrate the high efficiency of our framework.

16.

Blockchain Based Consensus Algorithm and Trustworthy Evaluation of Authenticated Subgraph Queries

G. Sharmila M. K. Kavitha Devi 《计算机系统科学与工程》2023,45(2):1743-1758

In recent years, subgraph mining from a large collection of graph databases has become a crucial problem. In addition, scalability is another big problem due to insufficient storage. There are several security challenges associated with subgraph mining in today’s on-demand system. To address this downside, our proposed work introduces a Blockchain-based Consensus algorithm for Authenticated query search in the Large-Scale Dynamic Graphs (BCCA-LSDG). The two-fold process is handled in the proposed BCCA-LSDG: graph indexing and authenticated query search (query processing). A blockchain-based reputation system is meant to maintain the trust blockchain and cloud server of the proposed architecture. To resolve the issues and provide safe big data transmission, the proposed technique also combines blockchain with a consensus algorithm architecture. Security of the big data is ensured by dividing the BC network into distinct networks, each with a restricted number of allowed entities, data kept in the cloud gate server, and data analysis in the blockchain. The consensus algorithm is crucial for maintaining the speed, performance and security of the blockchain. Then Dual Similarity based MapReduce helps in mapping and reducing the relevant subgraphs with the use of optimal feature sets. Finally, the graph index refinement process is undertaken to improve the query results. Concerning query error, fuzzy logic is used to refine the index of the graph dynamically. According to the findings, the proposed technique outperforms advanced methodologies in both blockchain and non-blockchain systems, and the combination of blockchain and subgraph provides a secure communication platform.

17.

Efficient Relational Techniques for Processing Graph Queries

Sherif Sakr Ghazi Al-Naymat 《计算机科学技术学报》2010,25(6):1237-1255

Graphs are widely used for modeling complicated data such as social networks, chemical compounds, protein interactions and semantic web. To effectively understand and utilize any collection of graphs, a graph database that efficiently supports elementary querying mechanisms is crucially required. For example, Subgraph and Supergraph queries are important types of graph queries which have many applications in practice. A primary challenge in computing the answers of graph queries is that pair-wise comparisons of graphs are usually hard problems. Relational database management systems (RDBMSs) have repeatedly been shown to be able to efficiently host different types of data such as complex objects and XML data. RDBMSs derive much of their performance from sophisticated optimizer components which make use of physical properties that are specific to the relational model such as sortedness, proper join ordering and powerful indexing mechanisms. In this article, we study the problem of indexing and querying graph databases using the relational infrastructure. We present a purely relational framework for processing graph queries. This framework relies on building a layer of graph features knowledge which captures metadata and summary features of the underlying graph database. We describe different querying mechanisms which make use of the layer of graph features knowledge to achieve scalable performance for processing graph queries. Finally, we conduct an extensive set of experiments on real and synthetic datasets to demonstrate the efficiency and the scalability of our techniques.

18.

Aggregate nearest neighbor queries in uncertain graphs

Zhang Liu Chaokun Wang Jianmin Wang 《World Wide Web》2014,17(1):161-188

Recently, uncertain graph data have begun attracting significant interest from the database research community, because uncertainty is an intrinsic property of the real world and data are more suitably modeled as graphs in a number of applications, e.g. social network analysis, PPI networks in biology, and road network monitoring. Meanwhile, as one of the basic query operators, the aggregate nearest neighbor (ANN) query retrieves a data entity whose aggregate distance, e.g. sum, max, to the given query data entities is smaller than those of other data entities in a database. ANN queries on both certain graph data and high dimensional data have been well studied by previous work. However, existing ANN query processing approaches cannot handle the situation of uncertain graphs, because topological structures of an uncertain graph may vary in different possible worlds. Motivated by this, we propose the aggregate nearest neighbor query in uncertain graphs (UG-ANN) in this paper. First of all, we give the formal definition of the UG-ANN query and the basic UG-ANN query algorithm. After that, to improve the efficiency of UG-ANN query processing, we develop two kinds of pruning approaches, i.e. structural pruning and instance pruning. The structural pruning takes advantage of the monotonicity of the aggregate distance to derive the upper and lower bounds of the aggregate distance for reducing the graph size. The instance pruning, in contrast, decreases the number of possible worlds to be checked in the searching tree. Comprehensive experimental results on real-world data sets demonstrate that the proposed method significantly improves the efficiency of UG-ANN query processing.
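
As background for the ANN definition in this abstract, the following minimal sketch evaluates an aggregate nearest neighbor query on an ordinary (certain) weighted undirected graph. It does not model uncertainty, possible worlds, or the pruning techniques of UG-ANN; the adjacency-list encoding, the function names, and the toy data are assumptions for illustration only.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source`; graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def aggregate_nearest_neighbor(graph, candidates, query_nodes, aggregate=sum):
    """Return (candidate, cost) minimizing the aggregate distance to all query nodes."""
    # One shortest-path computation per query node; valid for undirected graphs,
    # where the distance from a query node to a candidate equals the reverse.
    per_query = [dijkstra(graph, q) for q in query_nodes]
    best, best_cost = None, float("inf")
    for c in candidates:
        cost = aggregate(d.get(c, float("inf")) for d in per_query)
        if cost < best_cost:
            best, best_cost = c, cost
    return best, best_cost

# Hypothetical undirected graph, stored with both edge directions.
graph = {
    "q1": [("x", 4), ("y", 1)],
    "q2": [("x", 4), ("y", 6)],
    "x": [("q1", 4), ("q2", 4)],
    "y": [("q1", 1), ("q2", 6)],
}
```

With this data the choice of aggregate changes the answer: `sum` prefers `y` (total distance 7 versus 8), while `max` prefers `x` (worst-case distance 4 versus 6).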

19.

Subgraph query on large-scale dynamic labeled graphs using a hierarchical sequence index

任成林 姜丽雁 单晓欢 宋宝燕 《计算机工程与应用》2019,55(1):70-75

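Labeled graphs are commonly used for modeling in emerging fields such as intelligent transportation networks and bioinformatics networks. As a key problem in graph data analysis, subgraph query has attracted wide attention from researchers. A study of existing subgraph query algorithms shows that, as graph data grows in scale and is updated frequently, traditional subgraph query algorithms generally suffer from low query efficiency, high storage overhead, and neglect of vertex label information. To address this, a hierarchical sequence index that supports subgraph queries on large-scale dynamic labeled graphs (Dynamic Hierarchical Sequence, DHS) is proposed. The index extracts hierarchical topological sequence relationships, together with vertex identifiers, from the data graph to support subgraph queries. To handle dynamic changes to the graph, an index maintenance strategy based on topological expansion from updated vertices is proposed; it performs incremental updates starting only from the locally changed vertices and edges, which greatly reduces the high cost of rebuilding the index. A subgraph query method based on the DHS index is also proposed: the candidate set is obtained simply by matching the hierarchical sequences of the query graph against those of the data graph, and a relationship matching strategy is then applied to the candidates to obtain the final query results. Experiments show that the proposed method reduces index construction and maintenance time while keeping queries efficient, and improves subgraph query efficiency.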

20.

An α-β subgraph isomorphism matching algorithm between uncertain graphs

张一楠 邹兆年 李建中 《电脑学习》2011,(3):1-3,8

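A subgraph query returns all data graphs in a graph dataset that contain the query graph. For the case where both the query graph and the data graphs are uncertain graphs, definitions of expected subgraph isomorphism and of α-β subgraph isomorphism matching between uncertain graphs are proposed. Expected subgraph isomorphism between uncertain graphs is a direct generalization of subgraph isomorphism on deterministic graphs to the probabilistic graph model, whereas α-β subgraph isomorphism between uncertain graphs uses two threshold constraints to measure the quality of the match between the query graph and the data graph. The paper describes the semantics of α-β subgraph isomorphism matching in detail, analyzes its connections to and differences from expected subgraph isomorphism, and designs and implements a decision algorithm for α-β subgraph isomorphism matching.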