首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Graph matching and graph edit distance have become important tools in structural pattern recognition. The graph edit distance concept allows us to measure the structural similarity of attributed graphs in an error-tolerant way. The key idea is to model graph variations by structural distortion operations. As one of its main constraints, however, the edit distance requires the adequate definition of edit cost functions, which eventually determine which graphs are considered similar. In the past, these cost functions were usually defined in a manual fashion, which is highly prone to errors. The present paper proposes a method to automatically learn cost functions from a labeled sample set of graphs. To this end, we formulate the graph edit process in a stochastic context and perform a maximum likelihood parameter estimation of the distribution of edit operations. The underlying distortion model is learned using an Expectation Maximization algorithm. From this model we finally derive the desired cost functions. In a series of experiments we demonstrate the learning effect of the proposed method and provide a performance comparison to other models.  相似文献   

2.
Graph similarity is an important notion with many applications. Graph edit distance is one of the most flexible graph similarity measures available. The main problem with this measure is that in practice it can only be computed for small graphs due to its exponential time complexity. This paper addresses the high complexity of graph edit distance computations. Specifically, we present CSI_GED, a novel edge-centric approach for computing graph edit distance through common sub-structure isomorphisms enumeration. CSI_GED utilizes backtracking search combined with a number of heuristics to reduce memory requirements and quickly prune away a large portion of the mapping search space. Experiments show that CSI_GED is highly efficient for computing graph edit distance; it outperforms the state-of-the-art methods by over three orders of magnitude. It also shows that CSI_GED scales the computation gracefully to larger and distant graphs on which current methods fail to run. Moreover, we evaluated CSI_GED as a stand-alone graph edit similarity search query method. The experiments show that CSI_GED is effective and scalable, and outperforms the state-of-the-art indexing-based methods by over two orders of magnitude.  相似文献   

3.
Graph edit distance from spectral seriation   总被引:3,自引:0,他引:3  
This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph-matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems.  相似文献   

4.
魏征  汤进  江波  罗斌 《计算机应用》2013,33(1):44-48
图结构的特征提取及相似性度量是计算机视觉和模式识别中的重要研究内容。针对传统的方法对存在非刚性变换的图结构难以充分描述这一问题,给出一种基于图的上下文(GC)描述子的图结构信息描述及距离度量方法。首先,通过对图的边缘进行等距离散取样得到该图的采样点集;其次,基于图的采样点集给出图的上下文描述子;最后,采用推广的推土机距离(EMD)方法实现图的上下文描述子的距离度量。不同于图的编辑距离计算方法,所提方法不需要定义代价函数。实验表明该方法对于一些非刚性变换前后的图的距离计算具有较好的效果。  相似文献   

5.
A binary linear programming formulation of the graph edit distance   总被引:2,自引:0,他引:2  
A binary linear programming formulation of the graph edit distance for unweighted, undirected graphs with vertex attributes is derived and applied to a graph recognition problem. A general formulation for editing graphs is used to derive a graph edit distance that is proven to be a metric, provided the cost function for individual edit operations is a metric. Then, a binary linear program is developed for computing this graph edit distance, and polynomial time methods for determining upper and lower bounds on the solution of the binary program are derived by applying solution methods for standard linear programming and the assignment problem. A recognition problem of comparing a sample input graph to a database of known prototype graphs in the context of a chemical information system is presented as an application of the new method. The costs associated with various edit operations are chosen by using a minimum normalized variance criterion applied to pairwise distances between nearest neighbors in the database of prototypes. The new metric is shown to perform quite well in comparison to existing metrics when applied to a database of chemical graphs.  相似文献   

6.
Graphs have been widely used for complex data representation in many real applications, such as social network, bioinformatics, and computer vision. Therefore, graph similarity join has become imperative for integrating noisy and inconsistent data from multiple data sources. The edit distance is commonly used to measure the similarity between graphs. The graph similarity join problem studied in this paper is based on graph edit distance constraints. To accelerate the similarity join based on graph edit distance, in the paper, we make use of a preprocessing strategy to remove the mismatching graph pairs with significant differences. Then a novel method of building indexes for each graph is proposed by grouping the nodes which can be reached in k hops for each key node with structure conservation, which is the k-hop tree based indexing method. As for each candidate pair, we propose a similarity computation algorithm with boundary filtering, which can be applied with good efficiency and effectiveness. Experiments on real and synthetic graph databases also confirm that our method can achieve good join quality in graph similarity join. Besides, the join process can be finished in polynomial time.  相似文献   

7.
A survey of graph edit distance   总被引:1,自引:0,他引:1  
Inexact graph matching has been one of the significant research foci in the area of pattern analysis. As an important way to measure the similarity between pairwise graphs error-tolerantly, graph edit distance (GED) is the base of inexact graph matching. The research advance of GED is surveyed in order to provide a review of the existing literatures and offer some insights into the studies of GED. Since graphs may be attributed or non-attributed and the definition of costs for edit operations is various, the existing GED algorithms are categorized according to these two factors and described in detail. After these algorithms are analyzed and their limitations are identified, several promising directions for further research are proposed.  相似文献   

8.
Graph edit distance is a powerful and flexible method for error-tolerant graph matching. Yet it can only be calculated for small graphs in practice due to its exponential time complexity when considering unconstrained graphs. In this paper we propose a quadratic time approximation of graph edit distance based on Hausdorff matching. In a series of experiments we analyze the performance of the proposed Hausdorff edit distance in the context of graph classification and compare it with a cubic time algorithm based on the assignment problem. Investigated applications include nearest neighbor classification of graphs representing letter drawings, fingerprints, and molecular compounds as well as hidden Markov model classification of vector space embedded graphs representing handwriting. In many cases, a substantial speedup is achieved with only a minor loss in accuracy or, in one case, even with a gain in accuracy. Overall, the proposed Hausdorff edit distance shows a promising potential in terms of flexibility, efficiency, and accuracy.  相似文献   

9.
在图相似性搜索问题中,图编辑距离是较为普遍的度量方法,其计算性能很大程度上决定了图相似性搜索算法的性能。针对传统图编辑距离算法中存在的因大量冗余映射和较大搜索空间导致的性能低下问题,提出了一种改进的图编辑距离算法。该算法首先对图中顶点进行等价划分,以此计算映射编码来判断等价映射;然后定义映射完整性更新等价映射优先级,选出主映射参与扩展;其次,设计高效的启发式函数,提出基于映射编码的下界计算方法,快速得到最优映射。最后,将改进的图编辑距离算法扩展应用于图相似性搜索。在不同数据集上的实验结果表明,该算法具有更好的搜索性能,在搜索空间上最大可降低49%,速度提升了约29%。  相似文献   

10.
段瑞 《计算机应用研究》2020,37(4):1049-1053
为了提高从企业模型库中查询检索模型的效率,提出一种基于变迁图编辑距离的流程相似性算法。首先,给出了变迁图的概念及其生成方法;其次,提出边的长度概念,且删除和插入边的代价由该边的长度决定,基于此定义出图编辑操作及其代价,并用节点匹配算法计算最小图编辑距离;然后,给出两个过程模型的相似性概念和计算方法;最后,通过实验验证了算法的正确性且满足七条相似性性质,并验证了变迁图编辑距离满足四条距离性质。  相似文献   

11.
The spectrum of a graph has been widely used in graph theory to characterise the properties of a graph and extract information from its structure. It has also been employed as a graph representation for pattern matching since it is invariant to the labelling of the graph. There are, however, a number of potential drawbacks in using the spectrum as a representation of a graph. Firstly, more than one graph may share the same spectrum. It is well known, for example, that very few trees can be uniquely specified by their spectrum. Secondly, the spectrum may change dramatically with a small change structure.There are a wide variety of graph matrix representations from which the spectrum can be extracted. Among these are the adjacency matrix, combinatorial Laplacian, normalised Laplacian and unsigned Laplacian. Spectra can also be derived from the heat kernel matrix and path length distribution matrix. The choice of matrix representation clearly has a large effect on the suitability of spectrum in a number of pattern recognition tasks.In this paper we investigate the performance of the spectra as a graph representation in a variety of situations. Firstly, we investigate the cospectrality of the various matrix representations over large graph and tree sets, extending the work of previous authors. We then show that the Euclidean distance between spectra tracks the edit distance between graphs over a wide range of edit costs, and we analyse the accuracy of this relationship. We then use the spectra to both cluster and classify the graphs and demonstrate the effect of the graph matrix formulation on error rates. These results are produced using both synthetic graphs and trees and graphs derived from shape and image data.  相似文献   

12.
As a learning method of heterogeneous graph representation, heterogeneous graph neural networks can effectively extract complex structural and semantic information from heterogeneous graphs, and perform excellently in node classification and link prediction tasks to provide strong support for the representation and analysis of knowledge graphs. Due to the existence of some noisy interactions or missing interactions in the heterogeneous graphs, the heterogeneous graph neural network incorporates erroneous neighbor features, thus affecting the overall performance of the model. To solve the above problems, in this paper we proposes a heterogeneous graph structure learning model enhanced by multi-view contrast. Firstly, the semantic information in the heterogeneous graph is maintained by the meta-path, and the similarity graph is generated by calculating the feature similarity among the nodes under each meta-path, which is fused with the meta-path graph to optimize the graph structure. By contrasting the similarity graph and meta-path graph as multiple views, the graph structure is optimized without supervision information, and the dependence on supervision signals is eliminated. Finally, for addressing the problem that the learning ability of the neural network model is insufficient at the initial training stage and there are often erroneous interactions in the generated graph structure, we design a progressive graph structure fusion method. Through incremental weighted addition of meta-path graphs and similarity graphs, the weight of similarity graphs in the fusion is changed. This not only prevents erroneous interactions from being introduced in the initial training stage but also achieves the purpose of employing the interactions in similarity graphs to suppress interference interactions or complete missing interactions, which leads to the optimized heterogeneous structure. Meanwhile, node classification and node clustering are selected as the verification tasks of graph structure learning. The experimental results on four real heterogeneous graph datasets prove that the proposed learning method is feasible and effective. Compared with the optimal comparison model, the performance of this model has been significantly improved under both tasks.  相似文献   

13.
Graphs are widely used to model complicated data semantics in many applications in bioinformatics, chemistry, social networks, pattern recognition, etc. A recent trend is to tolerate noise arising from various sources such as erroneous data entries and find similarity matches. In this paper, we study graph similarity queries with edit distance constraints. Inspired by the $q$ -gram idea for string similarity problems, our solution extracts paths from graphs as features for indexing. We establish a lower bound of common features to generate candidates. Efficient algorithms are proposed to handle three types of graph similarity queries by exploiting both matching and mismatching features as well as degree information to improve the filtering and verification on candidates. We demonstrate the proposed algorithms significantly outperform existing approaches with extensive experiments on real and synthetic datasets.  相似文献   

14.
Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching   总被引:1,自引:0,他引:1  
In a way similar to the string-to-string correction problem, we address discrete time series similarity in light of a time-series-to-time-series-correction problem for which the similarity between two time series is measured as the minimum cost sequence of edit operations needed to transform one time series into another. To define the edit operations, we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call time warp edit distance (TWED). TWED is slightly different in form from dynamic time warping (DTW), longest common subsequence (LCSS), or edit distance with real penalty (ERP) algorithms. In particular, it highlights a parameter that controls a kind of stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a potentially useful metric in time series retrieval applications since it could benefit from the triangular inequality property to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived to link the matching of time series into down sampled representation spaces to the matching into the original space. The empiric quality of the TWED distance is evaluated on a simple classification task. Compared to edit distance, DTW, LCSS, and ERP, TWED has proved to be quite effective on the considered experimental task.  相似文献   

15.
The concept of graph edit distance constitutes one of the most flexible graph matching paradigms available. The major drawback of graph edit distance, viz. the exponential time complexity, has been recently overcome by means of a reformulation of the edit distance problem to a linear sum assignment problem. However, the substantial speed up of the matching is also accompanied by an approximation error on the distances. Major contribution of this paper is the introduction of a transformation process in order to convert the underlying cost model into a utility model. The benefit of this transformation is that it enables the integration of additional information in the assignment process. We empirically confirm the positive effects of this transformation on five benchmark graph sets with respect to the accuracy and run time of a distance based classifier.  相似文献   

16.
针对从业者不论是想从本地模型库还是线上共享网站获取所需的BIM 模型只能靠 逐个查找、人工识读的方法,而模型的数量越来越多,获取符合需求的模型需要花费大量的时 间和人力的问题,提出了一种构件级BIM 模型相似度计算方法。从模型的构件出发,以BIM 通用交互格式工业基础类(IFC)文件作数据源,以通用数据标准IFC 2×3 为数据基础,首先提取 模型中构件的几何信息、语义信息等,并利用改进的方向包围盒(OBB)碰撞检测算法查找相连 构件;然后以构件为顶点、构件间连接关系为边将BIM 模型构建为邻接图模型,并用图编辑距 离算法计算邻接图模型的编辑距离;最后即可计算出不同模型之间的相似度。该方法以构件级 BIM 模型的相似度为依据可以大大提升BIM 模型的检索速度与准确率。  相似文献   

17.
邴睿  袁冠  孟凡荣  王森章  乔少杰  王志晓 《软件学报》2023,34(10):4477-4500
异质图神经网络作为一种异质图表示学习的方法,可以有效地抽取异质图中的复杂结构与语义信息,在节点分类和连接预测任务上取得了优异的表现,为知识图谱的表示与分析提供了有力的支撑.现有的异质图由于存在一定的噪声交互或缺失部分交互,导致异质图神经网络在节点聚合、更新时融入错误的邻域特征信息,从而影响模型的整体性能.为解决该问题,提出了多视图对比增强的异质图结构学习模型.该模型首先利用元路径保持异质图中的语义信息,并通过计算每条元路径下节点之间特征相似度生成相似度图,将其与元路径图融合,实现对图结构的优化.通过将相似度图与元路径图作为不同视图进行多视图对比,实现无监督信息的情况下优化图结构,摆脱对监督信号的依赖.最后,为解决神经网络模型在训练初期学习能力不足、生成的图结构中往往存在错误交互的问题,设计了一个渐进式的图结构融合方法.通过将元路径图和相似度图递增地加权相加,改变图结构融合过程中相似度图所占的比例,在抑制了因模型学习能力弱引入过多的错误交互的同时,达到了用相似度图中的交互抑制原有干扰交互或补全缺失交互的目的,实现了对异质图结构的优化.选择节点分类与节点聚类作为图结构学习的验证任务,在4种...  相似文献   

18.
目前关于XML文档相似性算法有很多种,其中基于编辑距离的方法是很重要的一类。目前已发表的基于编辑距离的算法中,编辑图算法由于其计算高效率的特点成为研究的出发点。首先介绍了编辑图算法的思想,由于它在计算过程中对同层兄弟节点的顺序有很强的依赖性,因此不能准确有效地比较数据无序的数据中心的XML文档相似性。针对该问题,在编辑图算法思想的基础上,结合路径算法的思想提出拆分编辑图算法。实验结果表明,拆分编辑图算法降低了编辑图算法中对兄弟节点次序的依赖性,更适合于数据中心的XML文档相似性比较,而且所得结果更加准确有效。  相似文献   

19.
《Pattern recognition letters》1999,20(11-13):1271-1277
The computation of generalized median graphs (the graph with the smallest average edit distance to all graphs in a given set of graphs) is highly computationally complex. As a matter of fact, it is exponential in the number of nodes of the union of all graphs under consideration. Thus, the generalized median graph computation problem seems to be a suitable and challenging testbed for a comparison of combinatorial search and genetic algorithms. Two solutions are described in this paper. The first is an exact algorithm based on combinatorial search, while the second is a genetic algorithm. Both approaches are compared to each other in a series of experiments.  相似文献   

20.
Finding efficient, effective ways to compare graphs arising from recognition processes with their corresponding ground-truth graphs is an important step toward more rigorous performance evaluation.In this paper, we examine in detail the graph probing paradigm we first put forth in the context of our work on table understanding and later extended to HTML-coded Web pages. We present a formalism showing that graph probing provides a lower bound on the true edit distance between two graphs. From an empirical standpoint, the results of two simulation studies and an experiment using scanned pages show that graph probing correlates well with the latter measure. Moreover, our technique is very fast; graphs with tens or hundreds of thousands of vertices can be compared in mere seconds. Ease of implementation, scalability, and speed of execution make graph probing an attractive alternative for graph comparison.Received: 1 October 2002, Accepted: 15 January 2003, Published online: 6 February 2004Correspondence to: D. Lopresti  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号